Abstract

Synthetic aperture radar (SAR) uses relative motion to produce fine resolution
images from microwave frequencies and is a useful tool for regular monitoring and
mapping applications. Unfortunately, if target distance is estimated poorly, then
phase errors are incurred in the data, producing a blurry reconstruction of the
image. In this thesis, we introduce a new multistatic methodology for determining
these phase errors from interferometry-inspired combinations of signals. To
motivate this, we first consider a more general problem called phase retrieval, in
which a signal is reconstructed from linear measurements whose phases are either
unreliable or unavailable. We make significant theoretical progress on the phase
retrieval problem, including characterizing injectivity in the complex case,
devising the theory of almost injectivity, and performing a stability analysis. We
then apply certain ideas from phase retrieval to resolve phase errors in SAR.
Specifically, we use bistatic techniques to measure relative phases, and then we
apply a graph-theoretic phase retrieval algorithm to recover the phase errors. We
conclude by devising an image reconstruction procedure based on this algorithm,
and we provide simulations that demonstrate stability to noise.


Keywords: Synthetic aperture radar, phase retrieval, angular synchronization, phase
errors, circulant graphs, informationally complete, quantum mechanics, unit norm
tight frames, computational complexity, Cramér-Rao lower bound




Acknowledgements 


I  would  like  to  personally  thank  Matthew  Fickus  and  Jesse  Peterson,  both  of 
whom  significantly  influenced  the  development  of  this  thesis  with  their  thoughts  and 
insights.  Others  deserving  of  recognition  include  Afonso  Bandeira,  Jameson  Cahill 
and  Yang  Wang  for  their  contributions  to  the  ideas  presented  in  Chapters  II,  III 
and  IV.  I  extend  my  sincerest  thanks  to  my  research  advisor,  Dustin  Mixon,  whose 
constant  motivation  and  drive  were  pivotal  in  the  completion  of  this  document;  his 
attitude  is  infectious  and  his  patience  truly  endless.  Moreover,  his  constant  guidance 
and  instruction  have  made  my  time  at  AFIT  one  of  the  most  rewarding  experiences 
of  my  academic  career.  I  am  grateful  to  have  had  the  opportunity  to  work  with  him 
and,  most  importantly,  learn  from  him. 


This  research  was  supported  by  NSF  Grant  No.  DMS-1321779. 

SAR  image  of  Pentagon  courtesy  of  Sandia  National  Laboratories  [83] 

Aaron  A.  Nelson,  2d  Lt,  USAF 




Table  of  Contents 

Abstract

Acknowledgements

List of Figures

I. Introduction
1.1 Synthetic aperture radar
1.2 The phase retrieval problem
1.3 Overview

II. Injective intensity measurements and the $4M - 4$ conjecture
2.1 Injectivity and the complement property
2.2 Towards a rank-nullity theorem for phase retrieval
2.3 Achieving injectivity with $4M - 4$ intensity measurements

III. Almost injective intensity measurements and the computational limits of phase retrieval
3.1 Almost injectivity
3.2 The computational complexity of phase retrieval

IV. The stability of phase retrieval
4.1 Stability in the worst case
4.2 Stability in the average case

V. The phase error problem in synthetic aperture radar
5.1 Synthetic aperture radar
5.2 Angular synchronization
5.3 Formulating the phase error problem with graphs
5.4 Solving the phase error problem in the noisy case

VI. Conclusion

Appendices

Bibliography



List  of  Figures 

Figure

1 Classical airborne spotlight-mode SAR

2 The simplex in $\mathbb{R}^3$

3 Bounds on stability for phase retrieval from random measurements

4 Multistatic SAR to extract combinations of phase errors

5 Formulating the phase error problem with graphs

6 Iterative angular synchronization error versus number of cycles with noise

7 Iterative angular synchronization error versus input noise with varying cycles

8 Modulation recovery and SAR image reconstruction versus input noise

9 Examples of SAR image reconstruction via total variation maximization




About  Phase:  Synthetic  Aperture  Radar  and 
the  Phase  Retrieval  Problem 

I.  Introduction 

1.1  Synthetic  aperture  radar 

Synthetic  aperture  radar  (SAR)  is  a  form  of  radar  that  uses  relative  motion  to 
produce  fine  resolution  images  from  microwave  signals.  The  usefulness  of  SAR  stems 
from  its  ability  to  overcome  the  shortcomings  of  competing  remote  imaging  systems. 
For  instance,  its  day-or-night  and  all-weather  capabilities  give  SAR  an  advantage 
over  both  optical  cameras  and  infrared  imagers  while  maintaining  comparable  spatial 
resolution  [48].  As  such,  SAR  is  a  particularly  useful  tool  for  regular  monitoring  and 
mapping  applications,  some  of  which  include  the  following: 

Reconnaissance and surveillance. SAR imaging enables constant reconnaissance
and surveillance, as it can operate at any time of day and in all weather
conditions, while offering sufficient resolution to distinguish terrain features and
identify man-made targets. Even moving targets may be identified, and so SAR is
capable of monitoring traffic patterns or tracking the movement of personnel and
vehicles [2,74].

Topography.  With  the  help  of  certain  interferometric  techniques,  SAR  can  be 
used  to  create  accurate  topographic  maps  and  surface  profiles.  The  extremely  high 
resolution  of  these  techniques  also  enables  detection  of  sudden  seismic  activity  and 
even  volcanic  bulging  prior  to  the  eruption  of  volcanoes  [2,48,71]. 

Navigation  and  guidance.  All-weather,  autonomous  navigation  and  guidance 
may  be  accomplished  using  SAR  by  periodically  imaging  the  surrounding  terrain  and 
comparing  to  a  stored  reference  image.  This  comparison  then  provides  a  means  for 
navigation  update,  and  can  even  be  used  to  accurately  guide  aircraft  or  munitions 
to a target [74].

Foliage and ground penetration. Due to its use of microwave frequencies, SAR
offers the capability of penetrating optically opaque materials, such as foliage and
topsoil. Thus, SAR enables monitoring of activity normally hidden by trees, brush,
or similar ground cover. Depending on soil conditions, SAR is also capable of imaging
underground targets of sufficient size at depths of up to several meters [2,74].

Environmental monitoring. SAR is particularly sensitive to the dielectric
properties of materials, making it useful for monitoring the condition of vegetation.
Thus, it is an important agricultural and environmental tool, capable of accurately
monitoring crop characteristics, soil moisture levels, deforestation, ice flows, and
oil spills. In particular, SAR is effective at detecting oil spills over open water
due to certain backscatter effects [2,74].

SAR  works  by  implementing  a  moving  radar  platform  that  repeatedly  transmits 
a  certain  type  of  microwave  signal  and  records  the  return  signal  reflected  by  the  scene 
of  interest.  Typical  platforms  used  for  SAR  imaging  include  aircraft  and  satellite, 
although  each  platform  presents  its  own  challenges  during  reconstruction  [48].  Since 
the  radar  source  is  in  motion  (relative  to  the  target),  repeatedly  imaging  a  scene 
provides  information  from  a  continuously  changing  perspective,  and  it  is  precisely 
this  introduction  of  perspective  to  the  system  that  enables  image  reconstruction  with 
increased  resolution. 

Figure 1: (a) Classical airborne spotlight-mode synthetic aperture radar (SAR). The aircraft transmits a signal and
receives a version of that signal which encodes a portion of the Fourier transform of the desired image. (b)
Based on the aircraft’s current position, it obtains the depicted slice of the Fourier transform. (c) After
obtaining a range of slices of the image’s Fourier transform, the slices are interpolated before inverting
the Fourier transform. Unfortunately, if the distance between the aircraft and the scene of interest is
estimated poorly, then a phase error is incurred in the corresponding slice. Different phase errors for
different slices accumulate to produce a blurry reconstruction of the desired image. Such phase errors
are typically estimated and removed using various post-processing techniques, and while these tend to
work rather well, they are often ad hoc, requiring additional assumptions about the target scene, and they
sometimes fail unexpectedly. One of the contributions of this thesis is the introduction of a new multistatic
methodology for determining these phase errors from interferometry-inspired combinations of signals.

In airborne spotlight-mode SAR, appropriate assumptions regarding the
transmitted signal (e.g., assumptions relating its frequencies to the speed of light)
along with assumptions about the scene (e.g., that elevations are relatively constant,
so the desired image is simply a function over $\mathbb{R}^2$) enable the return signal
to be interpreted in terms of the Fourier transform of the target image. Indeed, under
these assumptions, the signal reflected back to the radar source is the transmitted
signal multiplied by a unit-modulus phase factor $\omega$, and pointwise multiplied by a
predictably modulated version of a one-dimensional slice of the Fourier transform of
the desired reflectivity function $\rho\colon \mathbb{R}^2 \to \mathbb{R}$, which describes
certain electromagnetic characteristics of the scene [48] (cf. Fact 5.1 in this thesis).
Specifically, if one transmits and receives a signal from a point $(x, y)$ to image a
scene which is centered at the origin, then the received signal encodes the portion of
the scene’s two-dimensional Fourier transform $F\rho$ that lies on the line which passes
through both $(x, y)$ and the origin (see Figure 1(a) and (b) for an illustration). In
practice, such portions are interpolated to reconstruct the entire two-dimensional
Fourier transform, from which the image may be easily obtained [48]. Unfortunately,
an issue arises when using this approach, namely, uncertainty in the target distance
(that is, the distance from the radar source to the target scene). Indeed, the phase
factor $\omega$ which appears in the received signal is a sensitive function of target
distance. Furthermore, small fluctuations in this distance are quite common due to
factors such as aircraft performance, weather, wind, and pilot skill [18]. Overall, any
noise in the estimates of these distances creates phase errors in the recorded signals,
and since the phase error will be different for each slice of the Fourier transform, the
image becomes distorted when taking the inverse Fourier transform.

The effect of phase errors is pointwise multiplication in the Fourier domain,
meaning the desired image is blurred in the spatial domain, typically enough so
that objects of interest within the target scene are indiscernible (see Figure 1(c),
for example). Although there are methods for dealing with phase errors during
post-processing (e.g., autofocus algorithms [63]), and many of these certainly
produce outstanding results, it is desirable to eliminate the problem prior to image
reconstruction. Indeed, many algorithms for correcting phase errors use ad hoc
techniques, require further assumptions on the target scene, and may fail
unexpectedly [37,44,54,79].
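
The blurring effect described above can be illustrated with a small one-dimensional analogue. The following is only a sketch under stated assumptions, not the SAR model of this thesis: a hypothetical "scene" consisting of a single point target is transformed with a naive discrete Fourier transform, each Fourier coefficient (standing in for a slice) is multiplied by an independent unit-modulus factor, and the result is inverted. The names `dft`, `idft` and `apply_phase_errors`, the length `N = 64` and the target position are all illustrative choices.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform of a list of complex numbers."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def apply_phase_errors(signal, phase_errors):
    """Multiply each Fourier coefficient by exp(i * theta), then invert."""
    spectrum = dft(signal)
    corrupted = [c * cmath.exp(1j * theta) for c, theta in zip(spectrum, phase_errors)]
    return idft(corrupted)


# Toy scene (an assumption for illustration): one point target at index 10.
N = 64
scene = [0j] * N
scene[10] = 1 + 0j

rng = random.Random(0)  # fixed seed so the run is repeatable
errors = [rng.uniform(-math.pi, math.pi) for _ in range(N)]
blurred = apply_phase_errors(scene, errors)

peak_clean = max(abs(v) for v in scene)
peak_blurred = max(abs(v) for v in blurred)
energy_clean = sum(abs(v) ** 2 for v in scene)
energy_blurred = sum(abs(v) ** 2 for v in blurred)
print(peak_clean, peak_blurred, energy_clean, energy_blurred)
```

Because the phase factors have unit modulus, the total energy is unchanged, but the sharp peak of the point target is smeared across the whole signal, which is the one-dimensional counterpart of a blurry image.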

It is reasonable to expect that the phase error problem in SAR could have a
more systematic solution if only additional signal data were available for analysis.
The desire for more information motivates the use of multistatic radar systems, in
which multiple radar sources, separated by distances comparable to any single
target-to-source distance, are capable of both transmitting and receiving signals
reflected by a common target. These systems enable multiple measurements to be taken
from varying perspectives, and when combined, these additional measurements often
improve resolution and even combat some of the weaknesses of monostatic radar
systems [71]. To date, multistatic techniques have been used for various
applications, such as tracking and triangulation, using both stationary and mobile
radar sources [59,64,71]. For instance, antenna “swarms” are a common airborne
application of multistatic radar for target tracking in which multiple radar platforms
with independent flight paths are capable of both transmitting and receiving signals
to and from a (possibly moving) target [13,36,71]. One way of realizing these
swarms is to mount radar receivers on a team of remotely piloted aircraft (RPAs);
in such multistatic systems, a common radar source transmits a signal to a target
while each RPA records time-delay and Doppler measurements of the reflected signal
which, when combined, provide target tracking that has been shown to outperform
traditional, static radar arrays [13,33]. Even passive sources like radio and television
broadcast signals can be incorporated in a multistatic system [15,59,64].

The techniques of monostatic SAR described earlier can be naturally extended
to the bistatic and multistatic settings [48]. As we will demonstrate, the introduction
of additional radar transmitters and receivers allows one to observe
interferometry-inspired combinations of the phase errors we seek to remove. We will
then apply ideas from a related, well-studied problem called phase retrieval to
estimate the phase errors from these combinations.

1.2  The  phase  retrieval  problem 

The  phase  error  problem  in  SAR  can  be  viewed  as  an  instance  of  a  more 
general  problem  called  phase  retrieval,  in  which  one  attempts  to  reconstruct  a  signal 
when  phase  information  is  either  unreliable  (as  in  the  case  of  SAR)  or  completely 
lost  during  some  linear  measurement  process.  Indeed,  given  slices  of  the  Fourier 
transform,  each  multiplied  by  a  different  unknown  phase  factor,  one  can  simply  ignore 
any  phase  information  by  taking  pointwise  absolute  values,  effectively  reducing  the 
phase  error  problem  in  SAR  to  the  most  common  problem  in  phase  retrieval:  recover 
an  image  from  the  pointwise  absolute  value  of  its  Fourier  transform.  This  reduction 
implies  that  any  method  of  phase  retrieval  is  also  a  solution  to  the  phase  error 
problem. 

We  note  that  phase  retrieval  is  interesting  in  its  own  right,  as  it  has  many 
applications  other  than  SAR: 

Coherent diffractive imaging. A common technique for imaging a nanoscale
object is to strike it with a highly coherent beam of X-rays and record the resultant
diffraction pattern using a photon counting device. This diffraction pattern is the
Fourier transform of the material density profile of the object. However, counting
photons only provides the intensity of the diffraction pattern, and so recovering the
image first requires obtaining the lost phase information via phase retrieval
[19,60,76,78].

Optics.  This  application  enjoys  various  instances  of  phase  retrieval: 

(i) In astronomy, imaging celestial objects like stars using a lens-based optical
system requires computing the associated pupil distribution (i.e., the distribution
of light that is allowed to exit the optical system). Since such systems only
detect the pointwise absolute value of the Fourier transform of the pupil function,
one must first recover the phases before building an image of the object [91].

(ii) When producing a high-resolution image of a radiating object, certain
interferometric techniques can be used to approximate the object’s spatial coherence
function, which is the Fourier transform of the object map (i.e., the spatial
intensity of the radiation). Unfortunately, the phase of this function is quite
difficult (and often impossible) to estimate accurately, and so is typically
discarded in favor of estimation by phase retrieval [40].

(iii) Soon after NASA launched its Hubble Space Telescope, it was discovered that
its primary mirror suffered from a large spherical aberration (i.e., phase errors
resulting from light striking the mirror near its edge). To determine the
proper correction, the extent of the aberration was established by constructing
the pupil function from the associated point spread function (which measures
the intensity of the Fourier transform of the pupil function) using phase
retrieval [55].

Quantum  state  tomography.  When  measuring  a  pure  quantum  state  using  a 
positive  operator-valued  measure  (POVM)  of  rank-1  elements,  the  distribution  of 
the  random  outcome  of  the  measurement  can  be  expressed  in  terms  of  the  state’s 
intensity  measurements  with  the  Parseval  frame  elements  which  generate  the  POVM. 
Since  repeated  measurements  of  the  state  produce  an  empirical  estimate  of  this 
distribution, phase retrieval can be used to identify the state [56,57,62].

Speech processing. In signal processing for speech applications, a common
method of denoising is to take the short-time Fourier transform (STFT) and perform
a smoothing operation on the magnitudes of the coefficients. Instead of inverting
the STFT using the (noisy) unaltered phases of the coefficients, one can recover the
denoised version of the signal by first discarding the phases and then reconstructing
with phase retrieval [8,85].

Although there are many applications of phase retrieval, the task is often
impossible. For instance, intensity measurements with the identity basis effectively
discard the phase information of a signal’s entries, and so this measurement process
is not at all injective; similarly, the power spectrum discards the phases of Fourier
coefficients. This fact has led many researchers to invoke a priori knowledge of
the desired signal, since intensity measurements might be injective when restricted
to a smaller signal class. This is frequently the case in optics applications, since
the pupil distribution is only supported within the aperture of the optical system;
the resultant compact-support constraint is often sufficient to make the intensity
measurement mapping injective. The introduction of such information has led to
various ad hoc phase retrieval algorithms, and while some have found success (e.g.,
in correcting the Hubble Space Telescope), such algorithms often fail
unexpectedly. (The situation is not unlike the state of the art for correcting phase
errors in SAR.) Overall, algorithms produced in this way typically lack practical
performance guarantees.

Thankfully, there is an alternative approach to phase retrieval, as introduced
in 2006 by Balan, Casazza and Edidin [8]: Instead of restricting to a smaller signal
class, seek injectivity by designing a larger ensemble of measurement vectors. (This
approach is an underlying theme throughout this thesis.) Unbeknownst to Balan et
al. at the time, the quantum mechanics community was already familiar with this
idea (for quantum state tomography [56,57]), but presenting the idea to the signal
processing community led to a flurry of research in search of practical phase retrieval
guarantees [3,5,7,9,24-27,43,49,88,90].

At this point, it is helpful to introduce some notation. Given a collection
of measurement vectors $\Phi = \{\varphi_n\}_{n=1}^{N}$ in $V = \mathbb{R}^M$ or
$\mathbb{C}^M$, which we identify with the $M \times N$ matrix whose columns form the
collection, we consider the intensity measurement process defined by

$$(\mathcal{A}(x))(n) := |\langle x, \varphi_n \rangle|^2.$$

For example, in the case of phase retrieval with the Fourier transform, each
$\varphi_n$ is a complex sinusoid and $\Phi^*$ is the Fourier transform. Note that
$\mathcal{A}(x) = \mathcal{A}(y)$ whenever $y = cx$ for some scalar $c$ of unit modulus.
As such, the mapping $\mathcal{A}\colon V \to \mathbb{R}^N$ is not injective. To resolve
this (technical) issue, throughout this thesis we consider sets of the form $V/S$,
where $V$ is a vector space and $S$ is a multiplicative subgroup of the field of
scalars. By this notation, we mean to identify vectors $x, y \in V$ for which there
exists a scalar $c \in S$ such that $y = cx$; we write $y \equiv x \bmod S$ to convey
this identification. Most (but not all) of the time, $V/S$ is either
$\mathbb{R}^M/\{\pm 1\}$ or $\mathbb{C}^M/\mathbb{T}$ (here, $\mathbb{T}$ is the complex
unit circle), and we view the intensity measurement process as a mapping
$\mathcal{A}\colon V/S \to \mathbb{R}^N$; it is in this way that we will consider the
measurement process to be injective or stable.
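
As a concrete illustration of this notation, the following sketch evaluates the intensity measurement process for a small ensemble and checks the global phase ambiguity numerically. It is a minimal example under stated assumptions: the ensemble of four vectors in the two-dimensional complex space, the signals, and the function names `inner` and `intensity_measurements` are hypothetical choices made only for illustration, and the inner product is taken to be conjugate-linear in its second argument.

```python
import cmath


def inner(x, phi):
    """Inner product <x, phi>, linear in x and conjugate-linear in phi."""
    return sum(a * b.conjugate() for a, b in zip(x, phi))


def intensity_measurements(x, ensemble):
    """Return the list of |<x, phi_n>|^2 over the measurement vectors phi_n."""
    return [abs(inner(x, phi)) ** 2 for phi in ensemble]


# Hypothetical ensemble of N = 4 vectors in C^2, chosen only for illustration.
ensemble = [
    [1, 0],
    [0, 1],
    [1, 1],
    [1, 1j],
]

x = [1 + 2j, 3 - 1j]
c = cmath.exp(0.7j)              # a scalar of unit modulus
y = [c * entry for entry in x]   # y is identified with x modulo the unit circle

a_x = intensity_measurements(x, ensemble)
a_y = intensity_measurements(y, ensemble)

# The identity basis alone cannot separate signals whose entries differ in phase.
identity = [[1, 0], [0, 1]]
u = [1, 1]
v = [1, -1]                      # v is not a unit-modulus multiple of u
same_under_identity = (
    intensity_measurements(u, identity) == intensity_measurements(v, identity)
)
separated_by_ensemble = (
    intensity_measurements(u, ensemble) != intensity_measurements(v, ensemble)
)
print(a_x, a_y, same_under_identity, separated_by_ensemble)
```

The measurements of `x` and `y` agree up to rounding, which is why injectivity is only ever asked for modulo a global phase. The pair `u`, `v` shows the other point made earlier: the identity basis loses genuinely different signals, while adding measurement vectors can tell them apart.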

In  order  to  perform  phase  retrieval  successfully,  we  therefore  seek  to  understand 
the  properties  of  the  measurement  ensemble  that  enable  recovery  of  a  signal  x  from 
measurements  of  the  form  A(x).  This  naturally  leads  to  the  following  question: 

The  Phase  Retrieval  Problem.  What  are  necessary  and  sufficient  conditions  for 
efficient  and  stable  recovery  of  a  signal  from  its  intensity  measurements? 

As a noteworthy stride toward solving the phase retrieval problem, Candès,
Strohmer and Voroninski [27] viewed intensity measurements as Hilbert-Schmidt
inner products between rank-1 operators, and they applied certain intuition from
compressed sensing to stably reconstruct the desired M-dimensional signal with
semidefinite programming; similar alternatives and refinements have since emerged
[24,34,43,89]. Another alternative exploits the polarization identity to discern
relative phases between certain intensity measurements; this method uses an expander
graph along with a new algorithm called angular synchronization to quickly solve
certain instances of phase retrieval, and it comes with a similar stability guarantee
[3,9]. One can also formulate phase retrieval in terms of MaxCut, and solvers for this
formulation are equivalent to a popular solver (PhaseLift) for the matrix recovery
formulation [88,90]. In this same line of research, a new methodology for coherent
diffractive imaging emerged [24]: Rather than attempting phase retrieval with
possibly incomplete information taken from a single exposure, take multiple exposures
of the same object using different diffraction gratings. Such a process is capable of
producing complete information and is associated with provably efficient (and
apparently stable) phase retrieval algorithms [9,26]. This approach inspires the use of
multistatic SAR in this thesis as a means of producing complete information for the
phase error problem.
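
To make the role of the polarization identity more tangible, the sketch below recovers the product of one measurement's conjugate with another from three additional intensity measurements. This is an illustration under stated assumptions rather than the exact construction of [3,9]: it uses the cube-root-of-unity form of the identity, the convention that the inner product is conjugate-linear in its second argument, and hypothetical vectors `x`, `phi_i`, `phi_j` and function names chosen for the example.

```python
import cmath
import math


def inner(x, phi):
    """Inner product <x, phi>, linear in x and conjugate-linear in phi."""
    return sum(a * b.conjugate() for a, b in zip(x, phi))


def relative_phase_product(x, phi_i, phi_j):
    """Recover conj(<x, phi_i>) * <x, phi_j> from three intensity measurements.

    With omega = exp(2*pi*i/3), the identity used here is
        conj(a) * b = (1/3) * sum_k omega^k * |<x, phi_i + omega^k * phi_j>|^2,
    where a = <x, phi_i> and b = <x, phi_j>.
    """
    omega = cmath.exp(2j * math.pi / 3)
    total = 0j
    for k in range(3):
        probe = [p + omega ** k * q for p, q in zip(phi_i, phi_j)]
        total += omega ** k * abs(inner(x, probe)) ** 2
    return total / 3


# Hypothetical signal and measurement vectors, for illustration only.
x = [1 + 2j, -0.5 + 1j, 2 - 1j]
phi_i = [1 + 0j, 1j, 0j]
phi_j = [0j, 1 + 0j, 1 - 1j]

from_intensities = relative_phase_product(x, phi_i, phi_j)
direct = inner(x, phi_i).conjugate() * inner(x, phi_j)
relative_phase = from_intensities / abs(from_intensities)
print(from_intensities, direct, relative_phase)
```

Only magnitudes of inner products enter `relative_phase_product`, yet its output carries the relative phase between the two measurements; normalizing it gives a unit-modulus number whenever both measurements are nonzero.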

1.3  Overview 

This thesis offers two main contributions: (i) we make significant theoretical
progress on the phase retrieval problem, and (ii) we apply certain ideas from phase
retrieval to resolve phase errors in synthetic aperture radar. We begin in Chapter II by
examining what it means for an ensemble of intensity measurements to be injective.
In particular, we discuss the characterization of injectivity in the real case as
introduced by Balan, Casazza and Edidin [8], i.e., the complement property, and provide
the first known characterization of injectivity in the complex case (Theorem 2.3).
Next, we make a rather surprising identification: that intensity measurements are
injective in the complex case precisely when the corresponding phase-only
measurements are injective in some sense (Theorem 2.4). We then use this identification to
prove the necessity of the complement property for injectivity (Theorem 2.6). Later,
we formulate a conjecture that $4M - 4$ intensity measurements are necessary and
sufficient for injectivity in the complex case, as well as discuss the cases for which
the conjecture is known to hold; we also prove several such cases. Specifically, for
the proof of the case $M = 3$ we introduce a new test for injectivity, which we then
use to verify the injectivity of a certain quantum mechanics-inspired measurement
ensemble, thereby suggesting a new refinement of Wright’s conjecture from [87] (see
Conjecture 2.14). The chapter concludes with an explicit construction of $4M - 4$
intensity measurements which yield injectivity, the second known injective
ensemble of this size (the first is due to Bodmann and Hammen [17]). Bodmann and
Hammen [17] leverage the Dirichlet kernel and the Cayley map to prove injectivity
of their ensemble, but it is unclear whether phase retrieval is algorithmically
feasible from their ensemble. By contrast, for the ensemble in this thesis, we use basic
ideas from harmonic analysis over cyclic groups to devise a corresponding phase
retrieval algorithm, and we demonstrate injectivity in Theorem 2.20 by proving that
the algorithm recovers any noiseless signal up to global phase.

In Chapter III, we devise a theory of ensembles for which the corresponding
intensity measurements are “almost” injective, that is, are injective on a set of signals
that is dense in $\mathbb{C}^M$. Here, we focus on the real case, meaning phase retrieval
is up to a global sign factor $\omega = \pm 1$, and our approach is inspired by the
characterization of injectivity in the real case by Balan, Casazza and Edidin [8]. After
characterizing almost injectivity in the real case, we find a particularly satisfying
sufficient condition for almost injectivity: that the ensemble of measurement vectors
forms a unit norm tight frame with relatively prime dimensions (Theorem 3.7).
Characterizing almost injectivity in the complex case remains an open problem. The
chapter concludes with a discussion of the computational limits of phase retrieval, in
which we consider algorithmic phase retrieval in the real case using $M + 1$ almost
injective intensity measurements. Specifically, we show that phase retrieval in this
case is NP-hard by reduction from the subset sum problem (Theorem 3.9). The hardness
of phase retrieval in this minimal case suggests a new problem for phase retrieval:
What is the smallest $C$ for which there exists a family of ensembles of size
$CM + o(M)$ such that phase retrieval can be performed in polynomial time?

We devote Chapter IV to stability in phase retrieval. Here, we start by focusing
on the real case, for which we give upper and lower Lipschitz bounds of the intensity
measurement mapping in terms of singular values of submatrices of the
measurement ensemble (Lemma 4.3 and Theorem 4.5); this suggests a new matrix condition
called the strong complement property, which strengthens the complement property
of Balan et al. [8] and bears some resemblance to the restricted isometry property of
compressed sensing [23]. As we will discuss, our result corroborates the intuition that
localized frames fail to yield stability. We then show that Gaussian random
measurements satisfy the strong complement property with high probability (Theorem 4.7),
which nicely complements certain results of Eldar and Mendelson [49]. In particular,
we find an explicit, intuitive relation between the Lipschitz bounds and the number
of intensity measurements per dimension (see Figure 3). Finally, we present results
in both the real and complex cases using a stochastic noise model, much like Balan
did for the real case in [5]; here, we leverage Cramér-Rao lower bounds to identify
stability with stronger versions of the injectivity characterizations (see Theorems 4.8
and 4.10).

Chapter V finally returns to the phase error problem in synthetic aperture
radar. By incorporating techniques from bistatic radar, we formulate the phase error
problem in terms of relative phases, bearing some resemblance to those obtained
from interferometric intensity measurements used for phase retrieval by Alexeev,
Bandeira, Fickus and Mixon [3]. In particular, Alexeev et al. leverage an algorithm
known as angular synchronization [84] to recover a set of phases from their relative
phase measurements, which motivates a graph-theoretic approach to the phase error
problem in SAR. Using this approach, we then formulate phase error recovery as a
feasibility problem, solutions to which are only unique up to a modulation and global
phase. We conclude by constructing an algorithm that extracts phase errors from
multistatic SAR data using certain graphs that can be obtained from particular
arrangements of different numbers of aircraft. Our two-step image reconstruction
algorithm first uses an iterative form of angular synchronization to determine the
phase errors up to a modulation and a single global phase factor, and then maximizes
the image’s total variation to determine the appropriate modulation and phase factor.
Simulations with random phase error data are provided, with which it is shown that
the algorithm exhibits stability in terms of the number of cycles contained in the
parent graph. In particular, the number of cycles is directly related to the number of
aircraft used in the multistatic system, and the simulations suggest that the phase
error problem can be solved using only a few aircraft (e.g., as few as five for a graph
of 101 vertices).

We  conclude  in  Chapter  VI  with  some  discussion  and  ideas  for  future  work. 
For  the  record,  the  material  presented  in  Chapters  II,  III,  and  IV  also  appears  in 
three  peer-reviewed  publications.  Sections  2.1  and  2.2,  as  well  as  Chapter  IV  and 
Appendix  A  have  appeared  in  the  proceedings  of  the  10th  International  Conference 
on  Sampling  Theory  and  Applications  [10],  and  a  journal  version  of  the  conference 
paper  has  been  accepted  for  publication  in  Applied  and  Computational  Harmonic 
Analysis  [11].  Also,  Section  2.3  and  Chapter  III  appear  in  a  journal  article  which 
has been accepted for publication in Linear Algebra and its Applications [52].

II. Injective intensity measurements and
the 4M - 4 conjecture

An underlying theme in the phase retrieval problem is determining necessary and
sufficient conditions for the intensity measurement process $\mathcal{A}$ to be injective. Indeed,
injectivity ensures complete information for signal reconstruction, and so these conditions
are important specifications for the intensity measurement process. Recall
that, given a collection of measurement vectors $\Phi = \{\varphi_n\}_{n=1}^N$ in $V = \mathbb{R}^M$ or $\mathbb{C}^M$, the
intensity measurement process $\mathcal{A}$ cannot be injective if viewed as a mapping from $V$
into $\mathbb{R}^N$. For this reason, we identify vectors $x, y \in V$ for which there exists a scalar
$c \in S$ such that $y = cx$, and we view the intensity measurement process as a mapping
$\mathcal{A}\colon V/S \to \mathbb{R}^N$, where $S = \{\pm 1\}$ or $\mathbb{T}$. In this chapter, we will examine what
it means for an ensemble of intensity measurements to be injective. We characterize
injectivity in both the real and complex cases before focusing on injectivity with the
absolute minimum number of intensity measurements. This leads to the conjecture
that $4M - 4$ intensity measurements are necessary and sufficient for injectivity in
the complex case (Conjecture 2.9). The remainder of the chapter is dedicated to
making progress on this conjecture, including a deterministic construction of $4M - 4$
intensity measurements that yield injectivity.

2.1  Injectivity  and  the  complement  property 

Phase retrieval is impossible without injective intensity measurements. As
such, we desire necessary and sufficient conditions on the size of an ensemble of
$M$-dimensional measurement vectors $\Phi = \{\varphi_n\}_{n=1}^N$ such that the intensity measurements
$\{|\langle x, \varphi_n\rangle|^2\}_{n=1}^N$ enable successful recovery of the signal $x$ (up to a global phase factor).
In their seminal work on phase retrieval [8], Balan, Casazza and Edidin introduce
the following property to analyze injectivity:
Definition 2.1. An ensemble $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ ($\mathbb{C}^M$) satisfies the complement
property (CP) if for every $S \subseteq \{1, \dots, N\}$, either $\{\varphi_n\}_{n \in S}$ or $\{\varphi_n\}_{n \in S^c}$ spans $\mathbb{R}^M$
($\mathbb{C}^M$).

Here and throughout, $S^c$ denotes the set $\{1, \dots, N\} \setminus S$. In the real case, the
complement  property  is  characteristic  of  injectivity,  as  demonstrated  in  [8].  The 
proof  of  this  result  is  provided  below;  it  contains  several  key  insights  which  will  be 
applied  later. 

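Theorem 2.2. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ and the mapping $\mathcal{A}\colon \mathbb{R}^M/\{\pm 1\} \to \mathbb{R}^N$
defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. Then $\mathcal{A}$ is injective if and only if $\Phi$ satisfies the
complement property.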

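Proof. We will prove both directions by obtaining the contrapositives.

($\Rightarrow$) Assume that $\Phi$ is not CP. Then there exists $S \subseteq \{1, \dots, N\}$ such that
neither $\{\varphi_n\}_{n \in S}$ nor $\{\varphi_n\}_{n \in S^c}$ spans $\mathbb{R}^M$. This implies that there are nonzero vectors
$u, v \in \mathbb{R}^M$ such that $\langle u, \varphi_n\rangle = 0$ for all $n \in S$ and $\langle v, \varphi_n\rangle = 0$ for all $n \in S^c$. For
each $n$, we then have

$$|\langle u \pm v, \varphi_n\rangle|^2 = |\langle u, \varphi_n\rangle|^2 \pm 2\operatorname{Re}\langle u, \varphi_n\rangle\langle v, \varphi_n\rangle + |\langle v, \varphi_n\rangle|^2 = |\langle u, \varphi_n\rangle|^2 + |\langle v, \varphi_n\rangle|^2.$$

Since $|\langle u + v, \varphi_n\rangle|^2 = |\langle u - v, \varphi_n\rangle|^2$ for every $n$, we have $\mathcal{A}(u + v) = \mathcal{A}(u - v)$.
Moreover, $u$ and $v$ are nonzero by assumption, and so $u + v \neq \pm(u - v)$.

($\Leftarrow$) Assume that $\mathcal{A}$ is not injective. Then there exist vectors $x, y \in \mathbb{R}^M$ such
that $x \neq \pm y$ and $\mathcal{A}(x) = \mathcal{A}(y)$. Taking $S := \{n : \langle x, \varphi_n\rangle = -\langle y, \varphi_n\rangle\}$, we have
$\langle x + y, \varphi_n\rangle = 0$ for every $n \in S$. Otherwise when $n \in S^c$, we have $\langle x, \varphi_n\rangle = \langle y, \varphi_n\rangle$
and so $\langle x - y, \varphi_n\rangle = 0$. Furthermore, both $x + y$ and $x - y$ are nontrivial since
$x \neq \pm y$, and so neither $\{\varphi_n\}_{n \in S}$ nor $\{\varphi_n\}_{n \in S^c}$ spans $\mathbb{R}^M$. □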
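
Because the complement property quantifies over the finitely many subsets $S$, it can be tested by brute force for small ensembles. The following is a minimal sketch of such a test in the real case, not a method taken from [8]: the helper names `rank` and `has_complement_property` are illustrative, and exact rational arithmetic is an assumption made here to avoid rounding issues in the rank computation.

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank of a list of real vectors, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def has_complement_property(vectors, dim):
    """Check that for every subset S, either S or its complement spans R^dim."""
    n = len(vectors)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            inside = [vectors[i] for i in subset]
            outside = [vectors[i] for i in range(n) if i not in subset]
            if rank(inside) < dim and rank(outside) < dim:
                return False
    return True


# Three vectors in R^2, any two of which are linearly independent.
cp_three = has_complement_property([(1, 0), (0, 1), (1, 1)], 2)
# Two vectors in R^2: splitting them one-and-one leaves neither side spanning.
cp_two = has_complement_property([(1, 0), (0, 1)], 2)
```

The search visits all $2^N$ subsets, so it is only practical for small $N$; it is meant to illustrate the definition rather than to serve as an efficient test.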

Note that in [8] it is erroneously stated that the first part of the above proof
also gives necessity of CP for injectivity in the complex case. Indeed, the proof
demonstrates that $u + v \neq \pm(u - v)$, but fails to establish that $u + v \not\equiv u - v \bmod \mathbb{T}$;
for instance, it could very well be the case that $u + v = i(u - v)$, and so injectivity
would not be violated in the complex case. A correct proof of the result in question
is provided later (Theorem 2.6). In the meantime, we characterize injectivity in the
complex case:

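Theorem 2.3. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$ and the mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$
defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. Viewing $\{\varphi_n\varphi_n^* u\}_{n=1}^N$ as vectors in $\mathbb{R}^{2M}$, denote
$S(u) := \operatorname{span}_{\mathbb{R}}\{\varphi_n\varphi_n^* u\}_{n=1}^N$. Then the following are equivalent:

(a) $\mathcal{A}$ is injective.

(b) $\dim S(u) \geq 2M - 1$ for every $u \in \mathbb{C}^M \setminus \{0\}$.

(c) $S(u) = \operatorname{span}_{\mathbb{R}}\{iu\}^\perp$ for every $u \in \mathbb{C}^M \setminus \{0\}$.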

Before proving this theorem, note that unlike the characterization in the real
case, it is not clear whether this characterization can be tested in finite time; instead
of being a statement about all (finitely many) partitions of $\{1, \dots, N\}$, this is a
statement about all $u \in \mathbb{C}^M \setminus \{0\}$. However, we can view this characterization as an
analog to the real case in some sense: In the real case, the complement property is
equivalent to having $\operatorname{span}\{\varphi_n\varphi_n^* u\}_{n=1}^N = \mathbb{R}^M$ for all $u \in \mathbb{R}^M \setminus \{0\}$. As the following
proof makes precise, the fact that $\{\varphi_n\varphi_n^* u\}_{n=1}^N$ fails to span all of $\mathbb{R}^{2M}$ is rooted in
the fact that more information is lost with phase in the complex case.

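Proof of Theorem 2.3. (a) $\Rightarrow$ (c): Suppose $\mathcal{A}$ is injective. We need to show that
$\{\varphi_n\varphi_n^* u\}_{n=1}^N$ spans the set of vectors orthogonal to $iu$. Here, orthogonality is with
respect to the real inner product, which can be expressed as $\langle a, b\rangle_{\mathbb{R}} = \operatorname{Re}\langle a, b\rangle$. Note
that

$$|\langle u \pm v, \varphi_n\rangle|^2 = |\langle u, \varphi_n\rangle|^2 \pm 2\operatorname{Re}\langle u, \varphi_n\rangle\langle \varphi_n, v\rangle + |\langle v, \varphi_n\rangle|^2,$$

and so subtraction gives

$$|\langle u + v, \varphi_n\rangle|^2 - |\langle u - v, \varphi_n\rangle|^2 = 4\operatorname{Re}\langle u, \varphi_n\rangle\langle \varphi_n, v\rangle = 4\langle \varphi_n\varphi_n^* u, v\rangle_{\mathbb{R}}. \tag{1}$$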

In particular, if the right-hand side of (1) is zero for every $n$, then injectivity implies that there
exists some $\omega$ of unit modulus such that $u + v = \omega(u - v)$. Since $u \neq 0$, we know
$\omega \neq -1$, and so rearranging gives

$$v = -\frac{1 - \omega}{1 + \omega}\, u = -\frac{(1 - \omega)(1 + \overline{\omega})}{|1 + \omega|^2}\, u = \frac{2\operatorname{Im}\omega}{|1 + \omega|^2}\, iu.$$

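This means $S(u)^\perp \subseteq \operatorname{span}_{\mathbb{R}}\{iu\}$. To prove $\operatorname{span}_{\mathbb{R}}\{iu\} \subseteq S(u)^\perp$, take $v = \alpha iu$ for
some $\alpha \in \mathbb{R}$ and define $\omega := \frac{1 + \alpha i}{1 - \alpha i}$, which necessarily has unit modulus. Then

$$u + v = u + \alpha iu = (1 + \alpha i)u = \frac{1 + \alpha i}{1 - \alpha i}(u - \alpha iu) = \omega(u - v).$$

Thus, the left-hand side of (1) is zero, meaning $v \in S(u)^\perp$.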

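(b) $\Leftrightarrow$ (c): First, (b) immediately follows from (c) since $\dim(\operatorname{span}_{\mathbb{R}}\{iu\}) = 1$
for all $u \in \mathbb{C}^M \setminus \{0\}$. For the other direction, note that $iu$ is necessarily orthogonal
to every $\varphi_n\varphi_n^* u$:

$$\langle \varphi_n\varphi_n^* u, iu\rangle_{\mathbb{R}} = \operatorname{Re}\langle \varphi_n\varphi_n^* u, iu\rangle = \operatorname{Re}\langle u, \varphi_n\rangle\langle \varphi_n, iu\rangle = -\operatorname{Re}\, i|\langle u, \varphi_n\rangle|^2 = 0.$$

Thus, $\operatorname{span}_{\mathbb{R}}\{iu\} \subseteq S(u)^\perp$ for all nonzero $u$. Since, by (b), $\dim S(u)^\perp \leq 1$, this then
gives (c).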

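(c) $\Rightarrow$ (a): This portion of the proof is inspired by Mukherjee's analysis in [80].
Suppose $\mathcal{A}(x) = \mathcal{A}(y)$. If $x = y$, we are done. Otherwise, $x - y \neq 0$, and so we may
apply (c) to $u = x - y$. First, note that

$$\langle \varphi_n\varphi_n^*(x - y), x + y\rangle_{\mathbb{R}} = \operatorname{Re}\langle \varphi_n\varphi_n^*(x - y), x + y\rangle = \operatorname{Re}\,(x + y)^*\varphi_n\varphi_n^*(x - y),$$

and so expanding gives

$$\langle \varphi_n\varphi_n^*(x - y), x + y\rangle_{\mathbb{R}} = \operatorname{Re}\big(|\varphi_n^* x|^2 - x^*\varphi_n\varphi_n^* y + y^*\varphi_n\varphi_n^* x - |\varphi_n^* y|^2\big) = \operatorname{Re}\big(-x^*\varphi_n\varphi_n^* y + \overline{x^*\varphi_n\varphi_n^* y}\big) = 0,$$

where the squared moduli cancel because $\mathcal{A}(x) = \mathcal{A}(y)$.
Since $x + y \in S(x - y)^\perp = \operatorname{span}_{\mathbb{R}}\{i(x - y)\}$, there exists $\alpha \in \mathbb{R}$ such that $x + y =
\alpha i(x - y)$, and so rearranging gives $y = -\frac{1 - \alpha i}{1 + \alpha i}\, x$, meaning $y \equiv x \bmod \mathbb{T}$. □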
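
Two identities carry most of the proof above: the polarization-type identity (1), and the fact that $iu$ is orthogonal to every $\varphi_n\varphi_n^* u$ in the real inner product. As a sanity check, the sketch below evaluates both on one hand-picked example with Gaussian-integer entries (so the floating-point arithmetic is exact). The vectors `u`, `v`, `phi` and the helper names are illustrative choices, not taken from the source.

```python
def inner(x, y):
    """<x, y> = y^* x, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def real_inner(x, y):
    """Real inner product <x, y>_R = Re <x, y>."""
    return inner(x, y).real


def sq_abs(z):
    """|z|^2 computed without a square root."""
    return z.real ** 2 + z.imag ** 2


def apply_rank_one(phi, u):
    """Return phi phi^* u = <u, phi> phi."""
    z = inner(u, phi)
    return [p * z for p in phi]


u = [1 + 2j, 3 - 1j]
v = [1j, 2 + 0j]
phi = [2 - 1j, 1j]

w = apply_rank_one(phi, u)

# Orthogonality of iu to phi phi^* u in the real inner product.
orth = real_inner(w, [1j * a for a in u])

# Both sides of identity (1).
u_plus_v = [a + b for a, b in zip(u, v)]
u_minus_v = [a - b for a, b in zip(u, v)]
lhs = sq_abs(inner(u_plus_v, phi)) - sq_abs(inner(u_minus_v, phi))
rhs = 4 * real_inner(w, v)
```

A single example proves nothing, of course; the point is only to make the inner product convention ($\langle x, \varphi\rangle = \varphi^* x$) concrete.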

Theorem  2.3  leaves  a  lot  to  be  desired;  it  is  still  unclear  what  it  takes  for  a 
complex ensemble to yield injective intensity measurements. While in pursuit of a
clearer understanding, we established the following bizarre characterization: A
complex  ensemble  yields  injective  intensity  measurements  precisely  when  it  yields 
injective  phase-only  measurements  (in  some  sense).  This  is  made  more  precise  in 
the  following  theorem  statement: 

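Theorem 2.4. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$ and the mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$
defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. Then $\mathcal{A}$ is injective if and only if the following
statement holds: If for every $n = 1, \dots, N$, either $\arg(\langle x, \varphi_n\rangle^2) = \arg(\langle y, \varphi_n\rangle^2)$ or
one of the sides is not well-defined, then $x = 0$, $y = 0$, or $y \equiv x \bmod \mathbb{R} \setminus \{0\}$.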

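Proof. By Theorem 2.3, $\mathcal{A}$ is injective if and only if

$$\forall x \in \mathbb{C}^M \setminus \{0\}, \qquad \operatorname{span}_{\mathbb{R}}\{\varphi_n\varphi_n^* x\}_{n=1}^N = \operatorname{span}_{\mathbb{R}}\{ix\}^\perp. \tag{2}$$

Taking orthogonal complements of both sides, note that regardless of $x \in \mathbb{C}^M \setminus \{0\}$,
we know $\operatorname{span}_{\mathbb{R}}\{ix\}$ is necessarily a subset of $(\operatorname{span}_{\mathbb{R}}\{\varphi_n\varphi_n^* x\}_{n=1}^N)^\perp$, and so (2) is
equivalent to

$$\forall x \in \mathbb{C}^M \setminus \{0\}, \qquad \operatorname{Re}\langle \varphi_n\varphi_n^* x, iy\rangle = 0 \ \ \forall n = 1, \dots, N \quad \Longrightarrow \quad y = 0 \ \text{or}\ y \equiv x \bmod \mathbb{R} \setminus \{0\}.$$

Thus, we need to determine when $\operatorname{Im}\langle x, \varphi_n\rangle\overline{\langle y, \varphi_n\rangle} = \operatorname{Re}\langle \varphi_n\varphi_n^* x, iy\rangle = 0$. We claim
that this is true if and only if $\arg(\langle x, \varphi_n\rangle^2) = \arg(\langle y, \varphi_n\rangle^2)$ or one of the sides is
not well-defined. To see this, we substitute $a := \langle x, \varphi_n\rangle$ and $b := \langle y, \varphi_n\rangle$. Then to
complete the proof, it suffices to show that $\operatorname{Im} a\overline{b} = 0$ if and only if $\arg(a^2) = \arg(b^2)$,
$a = 0$, or $b = 0$.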

($\Leftarrow$) If either $a$ or $b$ is zero, the result is immediate. Otherwise, if

$$2\arg(a) = \arg(a^2) = \arg(b^2) = 2\arg(b),$$

then $2\pi$ divides $2(\arg(a) - \arg(b))$, and so $\arg(a\overline{b}) = \arg(a) - \arg(b)$ is a multiple of
$\pi$. This implies that $a\overline{b} \in \mathbb{R}$, and so $\operatorname{Im} a\overline{b} = 0$.

($\Rightarrow$) Suppose $\operatorname{Im} a\overline{b} = 0$. Taking the polar decompositions $a = re^{i\theta}$ and $b = se^{i\phi}$,
we equivalently have that $rs\sin(\theta - \phi) = 0$. Certainly, this can occur whenever
$r$ or $s$ is zero, i.e., $a = 0$ or $b = 0$. Otherwise, a difference formula then gives
$\sin\theta\cos\phi = \cos\theta\sin\phi$. From this, we know that if $\theta$ is an odd integer multiple of $\pi/2$,
then $\phi$ is as well, and vice versa, in which case

$$\arg(a^2) = 2\arg(a) = \pi = 2\arg(b) = \arg(b^2).$$

Else, we can divide both sides by $\cos\theta\cos\phi$ to obtain $\tan\theta = \tan\phi$, from which it
is evident that $\theta \equiv \phi \bmod \pi$, and so $\arg(a^2) = 2\arg(a) = 2\arg(b) = \arg(b^2)$. □

This notion of injective phase-only measurements is similar to the idea of parallel
rigidity in certain location estimation problems (for example, see [12] and references
therein). It would be interesting to further investigate this relationship,
although  we  will  not  do  so  here;  at  the  very  least,  it  is  rather  striking  that  injectivity 
is  equivalent  in  both  settings.  We  will  actually  use  this  result  to  (correctly)  prove 
the  necessity  of  CP  for  injectivity.  First,  we  need  the  following  lemma,  which  is 
interesting  in  its  own  right: 
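Lemma 2.5. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$ and the mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$ defined
by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. If $\mathcal{A}$ is injective, then the mapping $\mathcal{B}\colon \mathbb{C}^M/\{\pm 1\} \to \mathbb{C}^N$
defined by $(\mathcal{B}(x))(n) := \langle x, \varphi_n\rangle^2$ is also injective.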

Proof. Suppose $\mathcal{A}$ is injective. Then we have the following facts (one by definition,
and the other by Theorem 2.4):

(i) If $|\langle x, \varphi_n\rangle|^2 = |\langle y, \varphi_n\rangle|^2$ for all $n = 1, \dots, N$, then $y \equiv x \bmod \mathbb{T}$.

(ii) If, for every $n \in \{1, \dots, N\}$, either $\arg(\langle x, \varphi_n\rangle^2) = \arg(\langle y, \varphi_n\rangle^2)$ or one of the
sides is not well-defined, then $x = 0$, $y = 0$, or $y \equiv x \bmod \mathbb{R} \setminus \{0\}$.

Now suppose we have $\langle x, \varphi_n\rangle^2 = \langle y, \varphi_n\rangle^2$ for all $n = 1, \dots, N$. Then their moduli
and arguments are also equal, and so (i) and (ii) both apply. Of course, $y \equiv x \bmod \mathbb{T}$
implies $x = 0$ if and only if $y = 0$. Otherwise both are nonzero, in which case there
exists $\omega \in \mathbb{T} \cap \mathbb{R} \setminus \{0\} = \{\pm 1\}$ such that $y = \omega x$. In either case, $y \equiv x \bmod \{\pm 1\}$,
so $\mathcal{B}$ is injective. □

Leveraging  the  injectivity  of  B  modulo  {±1},  we  may  now  extend  the  necessity 
of  CP  for  injectivity  to  complex  ensembles: 

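Theorem 2.6. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$ and the mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$
defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. If $\mathcal{A}$ is injective, then $\Phi$ satisfies the complement
property.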

Proof. Recall that if $\mathcal{A}$ is injective, then so is the mapping $\mathcal{B}$ of Lemma 2.5. Therefore,
it suffices to show that $\Phi$ is CP if $\mathcal{B}$ is injective. To complete the proof, we will
obtain the contrapositive (note the similarity to the proof of Theorem 2.2). Suppose
$\Phi$ is not CP. Then there exists $S \subseteq \{1, \dots, N\}$ such that neither $\{\varphi_n\}_{n \in S}$ nor
$\{\varphi_n\}_{n \in S^c}$ spans $\mathbb{C}^M$. This implies that there are nonzero vectors $u, v \in \mathbb{C}^M$ such that
$\langle u, \varphi_n\rangle = 0$ for all $n \in S$ and $\langle v, \varphi_n\rangle = 0$ for all $n \in S^c$. For each $n$, we then have

$$\langle u \pm v, \varphi_n\rangle^2 = \langle u, \varphi_n\rangle^2 \pm 2\langle u, \varphi_n\rangle\langle v, \varphi_n\rangle + \langle v, \varphi_n\rangle^2 = \langle u, \varphi_n\rangle^2 + \langle v, \varphi_n\rangle^2.$$

Since $\langle u + v, \varphi_n\rangle^2 = \langle u - v, \varphi_n\rangle^2$ for every $n$, we have $\mathcal{B}(u + v) = \mathcal{B}(u - v)$. Moreover,
$u$ and $v$ are nonzero by assumption, and so $u + v \neq \pm(u - v)$. □

Note that the complement property is necessary but not sufficient for injectivity.
To see this, consider the measurement vectors $(1, 0)$, $(0, 1)$ and $(1, 1)$ in $\mathbb{C}^2$. These
certainly satisfy the complement property, but $\mathcal{A}((1, i)) = (1, 1, 2) = \mathcal{A}((1, -i))$, despite
the fact that $(1, i) \not\equiv (1, -i) \bmod \mathbb{T}$; in general, real measurement vectors fail
to yield injective intensity measurements in the complex setting since they do not
distinguish complex conjugates. Indeed, we have yet to find a "good" sufficient condition
for injectivity in the complex case. As an analogy for what we really want,
consider the notion of full spark: An ensemble $\{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ is said to be full
spark if every subcollection of $M$ vectors spans $\mathbb{R}^M$. It is easy to see that full spark
ensembles with $N \geq 2M - 1$ necessarily satisfy the complement property (thereby
implying injectivity in the real case), since in this case

$$\min_{S \subseteq \{1, \dots, N\}} \max\{|S|, |S^c|\} \geq M,$$

and so it is guaranteed that one of the sets $\{\varphi_n\}_{n \in S}$ or $\{\varphi_n\}_{n \in S^c}$ spans. Furthermore,
the notion of full spark is simple enough to admit deterministic constructions [4,81].
Deterministic measurement ensembles are particularly desirable for the complex case,
and so finding a good sufficient condition for injectivity is an important problem that
remains open.
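
The counterexample with the measurement vectors $(1, 0)$, $(0, 1)$ and $(1, 1)$ is small enough to check directly. The following sketch computes the intensity measurements of $(1, i)$ and $(1, -i)$; the helper names `inner` and `intensities` are illustrative, and squared moduli are computed from real and imaginary parts to keep the arithmetic exact.

```python
def inner(x, phi):
    """<x, phi> = phi^* x, linear in the first argument."""
    return sum(xi * complex(pi).conjugate() for xi, pi in zip(x, phi))


def intensities(x, ensemble):
    """Intensity measurements |<x, phi_n>|^2 for each phi_n in the ensemble."""
    values = []
    for phi in ensemble:
        z = inner(x, phi)
        values.append(z.real ** 2 + z.imag ** 2)
    return values


ensemble = [(1, 0), (0, 1), (1, 1)]  # real vectors, viewed in C^2
x = (1, 1j)
y = (1, -1j)  # entrywise complex conjugate of x

meas_x = intensities(x, ensemble)
meas_y = intensities(y, ensemble)

# x and y are orthogonal, so y cannot be a unimodular multiple of x.
overlap = inner(x, y)
```

Both signals produce the same measurements even though they are orthogonal, which is about as far from "equal up to a global phase" as two unit-norm-proportional vectors can be.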

2.2  Towards  a  rank-nullity  theorem  for  phase  retrieval 

If one thinks of a matrix $\Phi$ as being built one column at a time, then the
rank-nullity theorem states that each column contributes to either the column space
or the null space. If the columns are then used as linear measurement vectors (say
we take measurements $y = \Phi^* x$ of a vector $x$), then the column space of $\Phi$ gives
the subspace that is actually sampled, and the null space captures the algebraic
nature of the measurements' redundancy. Therefore, an efficient sampling of an
entire vector space would apply a matrix $\Phi$ with a small null space and large column
space (e.g., an invertible square matrix). How do we find such a sampling with
intensity measurements? The following makes this question more precise:


Problem 2.7. For any dimension $M$, what is the smallest number $N^*(M)$ of injective
intensity measurements, and how do we design such measurement vectors?

To be clear, this problem was completely solved in the real case by Balan,
Casazza and Edidin [8]. Indeed, Theorem 2.2 immediately implies that $2M - 2$
intensity measurements are necessarily not injective, and furthermore that $2M - 1$
measurements are injective if and only if the measurement vectors are full spark. As
such, we will focus our attention on the complex case.

In the complex case, Problem 2.7 has some history in the quantum mechanics
literature. For example, [87] presents Wright's conjecture that three observables
suffice to uniquely determine any pure state. In phase retrieval parlance, the conjecture
states that there exist unitary matrices $U_1$, $U_2$ and $U_3$ such that $\Phi = [U_1\ U_2\ U_3]$
yields injective intensity measurements. Note that Wright's conjecture actually implies
that $N^*(M) \leq 3M - 2$; indeed, $U_1$ determines the norm (squared) of the signal,
rendering the last column of both $U_2$ and $U_3$ unnecessary. Finkelstein [56] later
proved that $N^*(M) \geq 3M - 2$; combined with Wright's conjecture, this led many
to believe that $N^*(M) = 3M - 2$ (for example, see [24]). However, both this and
Wright's conjecture were recently disproved in [62], in which Heinosaari, Mazzarella
and Wolf invoked embedding theorems from differential geometry to prove that


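$$N^*(M) \geq \begin{cases} 4M - 2\alpha(M - 1) - 3 & \text{for all } M \\ 4M - 2\alpha(M - 1) - 2 & \text{if } M \text{ is odd and } \alpha(M - 1) \equiv 2 \bmod 4 \\ 4M - 2\alpha(M - 1) - 1 & \text{if } M \text{ is odd and } \alpha(M - 1) \equiv 3 \bmod 4, \end{cases} \tag{3}$$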
where $\alpha(M - 1) \leq \log_2(M)$ is the number of 1's in the binary representation of $M - 1$.
By comparison, Balan, Casazza and Edidin [8] proved that $N^*(M) \leq 4M - 2$, and
so we at least have the asymptotic expression $N^*(M) = (4 + o(1))M$.
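
To get a feel for how close the lower bound (3) sits to the upper bound $4M - 2$, one can simply tabulate it. The sketch below is an illustrative transcription of (3) as stated above; the function names `alpha` and `lower_bound_3` are hypothetical and not from [62].

```python
def alpha(n):
    """Number of 1's in the binary representation of n."""
    return bin(n).count("1")


def lower_bound_3(M):
    """Right-hand side of (3): the best applicable case for dimension M."""
    a = alpha(M - 1)
    bound = 4 * M - 2 * a - 3  # valid for all M
    if M % 2 == 1 and a % 4 == 2:
        bound = 4 * M - 2 * a - 2
    elif M % 2 == 1 and a % 4 == 3:
        bound = 4 * M - 2 * a - 1
    return bound


# (M, lower bound from (3), upper bound 4M - 2 from [8]) for a few dimensions.
table = [(M, lower_bound_3(M), 4 * M - 2) for M in range(2, 17)]
```

Since $\alpha(M - 1) \leq \log_2(M)$, the gap between the two columns grows only logarithmically in $M$, which is what the asymptotic expression $N^*(M) = (4 + o(1))M$ records.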

At this point, we should clarify some intuition for $N^*(M)$ by explaining the
nature of these best known lower and upper bounds. First, the lower bound (3)
follows from an older result that complex projective space $\mathbb{CP}^n$ does not smoothly
embed into $\mathbb{R}^{4n - 2\alpha(n)}$ (and other slight refinements which depend on $n$); this is due to
Mayer [75], but we highly recommend James's survey on the topic [66]. To prove (3)
from this, suppose $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$ were injective. Then $\mathcal{E}$ defined by $\mathcal{E}(x) :=
\mathcal{A}(x)/\|x\|^2$ embeds $\mathbb{CP}^{M-1}$ into $\mathbb{R}^N$, and as Heinosaari et al. show, the embedding is
necessarily smooth; considering $\mathcal{A}(x)$ is made up of rather simple polynomials, the
fact that $\mathcal{E}$ is smooth should not come as a surprise. As such, the nonembedding
result produces the best known lower bound. To evaluate this bound, first note
that Milgram [77] constructs an embedding of $\mathbb{CP}^n$ into $\mathbb{R}^{4n - \alpha(n) + 1}$, establishing the
importance of the $\alpha(n)$ term, but the constructed embedding does not correspond
to an intensity measurement process. In order to relate these embedding results to
our problem, consider the real case: It is known that for odd $n \geq 7$, real projective
space $\mathbb{RP}^n$ smoothly embeds into $\mathbb{R}^{2n - \alpha(n) + 1}$ [86], which means the analogous lower
bound for the real case would necessarily be smaller than

$$2(M - 1) - \alpha(M - 1) + 1 = 2M - \alpha(M - 1) - 1 < 2M - 1.$$

This indicates that the $\alpha(M - 1)$ term in (3) might be an artifact of the proof
technique, rather than of $N^*(M)$.

There is also some intuition to be gained from the upper bound $N^*(M) \leq
4M - 2$, which Balan et al. [8] proved by applying certain techniques from algebraic
geometry (some of which will be applied later in this section). In fact, their result
actually gives that $4M - 2$ or more measurement vectors, if chosen generically, will
yield injective intensity measurements; here, generic is a technical term involving
the Zariski topology, but it can be thought of as some undisclosed property which
is satisfied with probability 1 by measurement vectors drawn from continuous distributions.
This leads us to think that $N^*(M)$ generic measurement vectors might
also yield injectivity.

The lemma that follows will help to refine our intuition for $N^*(M)$, and it will
also play a key role in the main theorems of this section (a similar result appears
in [62]). Before stating the result, define the real $M^2$-dimensional space $\mathbb{H}^{M \times M}$ of
self-adjoint $M \times M$ matrices; note that this is not a vector space over the complex
numbers since the diagonal of a self-adjoint matrix must be real. Given an
ensemble of measurement vectors $\{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$, define the super analysis operator
$\mathbf{A}\colon \mathbb{H}^{M \times M} \to \mathbb{R}^N$ by $(\mathbf{A}H)(n) = \langle H, \varphi_n\varphi_n^*\rangle_{\mathrm{HS}}$; here, $\langle \cdot, \cdot\rangle_{\mathrm{HS}}$ denotes the Hilbert-Schmidt
inner product, which induces the Frobenius matrix norm. Note that $\mathbf{A}$ is a
linear operator, and yet

$$(\mathbf{A}xx^*)(n) = \langle xx^*, \varphi_n\varphi_n^*\rangle_{\mathrm{HS}} = \operatorname{Tr}[\varphi_n\varphi_n^* xx^*] = \operatorname{Tr}[\varphi_n^* xx^* \varphi_n] = \varphi_n^* xx^* \varphi_n = |\langle x, \varphi_n\rangle|^2 = (\mathcal{A}(x))(n).$$
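
The chain of equalities above can be spot-checked numerically. The following sketch, with illustrative helper names and hand-picked Gaussian-integer vectors (so that floating-point arithmetic is exact), compares the Hilbert-Schmidt inner product of the lifted matrices with the intensity measurement, and also confirms that multiplying $x$ by a unimodular scalar leaves $xx^*$ unchanged.

```python
def outer(x):
    """Rank-one lift x x^* as a nested list of complex numbers."""
    return [[xj * xk.conjugate() for xk in x] for xj in x]


def hs_inner(H, G):
    """Hilbert-Schmidt inner product <H, G>_HS = Tr(G^* H)."""
    return sum(
        h * g.conjugate()
        for h_row, g_row in zip(H, G)
        for h, g in zip(h_row, g_row)
    )


def intensity(x, phi):
    """|<x, phi>|^2 with <x, phi> = phi^* x."""
    z = sum(xj * pj.conjugate() for xj, pj in zip(x, phi))
    return z.real ** 2 + z.imag ** 2


x = [1 + 2j, 3 - 1j]
phi = [2 - 1j, 1j]

lifted = hs_inner(outer(x), outer(phi))  # linear in the lifted variable x x^*
direct = intensity(x, phi)               # quadratic in x

# The lift forgets the global phase: (ix)(ix)^* equals x x^*.
same_lift = outer([1j * xj for xj in x]) == outer(x)
```

The first quantity is linear in the matrix $xx^*$ while the second is quadratic in $x$; that they agree is exactly the linearization being described.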

In words, the class of vectors identified with $x$ modulo $\mathbb{T}$ can be "lifted" to $xx^*$,
thereby linearizing the intensity measurement process at the price of squaring the
dimension of the vector space of interest; this identification has been exploited by
some of the most noteworthy strides in modern phase retrieval [7,27]. As the following
lemma shows, this identification can also be used to characterize injectivity:

Lemma 2.8. $\mathcal{A}$ is not injective if and only if there exists a matrix of rank 1 or 2 in
the null space of $\mathbf{A}$.

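Proof. ($\Rightarrow$) If $\mathcal{A}$ is not injective, then there exist $x, y \in \mathbb{C}^M/\mathbb{T}$ with $x \not\equiv y \bmod \mathbb{T}$
such that $\mathcal{A}(x) = \mathcal{A}(y)$. That is, $\mathbf{A}xx^* = \mathbf{A}yy^*$, and so $xx^* - yy^*$ is in the null space
of $\mathbf{A}$; this matrix is nonzero since $x \not\equiv y \bmod \mathbb{T}$, and so it has rank 1 or 2.

($\Leftarrow$) First, suppose there is a rank-1 matrix $H$ in the null space of $\mathbf{A}$. Then
there exists $x \in \mathbb{C}^M$ such that $H = \pm xx^*$ and

$$(\mathcal{A}(x))(n) = (\mathbf{A}xx^*)(n) = 0 = (\mathcal{A}(0))(n).$$

But $x \not\equiv 0 \bmod \mathbb{T}$, and so $\mathcal{A}$ is not injective. Now suppose there is a rank-2 matrix
$H$ in the null space of $\mathbf{A}$. Then by the spectral theorem, there are orthonormal
$u_1, u_2 \in \mathbb{C}^M$ and nonzero $\lambda_1 \geq \lambda_2$ such that $H = \lambda_1 u_1 u_1^* + \lambda_2 u_2 u_2^*$. Since $H$ is in the
null space of $\mathbf{A}$, the following holds for every $n$:

$$0 = \langle H, \varphi_n\varphi_n^*\rangle_{\mathrm{HS}} = \langle \lambda_1 u_1 u_1^* + \lambda_2 u_2 u_2^*, \varphi_n\varphi_n^*\rangle_{\mathrm{HS}} = \lambda_1|\langle u_1, \varphi_n\rangle|^2 + \lambda_2|\langle u_2, \varphi_n\rangle|^2. \tag{4}$$

Taking $x := |\lambda_1|^{1/2} u_1$ and $y := |\lambda_2|^{1/2} u_2$, note that $y \not\equiv x \bmod \mathbb{T}$ since they are
nonzero and orthogonal. We claim that $\mathcal{A}(x) = \mathcal{A}(y)$, which would complete the
proof. If $\lambda_1$ and $\lambda_2$ have the same sign, then by (4), $|\langle x, \varphi_n\rangle|^2 + |\langle y, \varphi_n\rangle|^2 = 0$ for
every $n$, meaning $|\langle x, \varphi_n\rangle|^2 = 0 = |\langle y, \varphi_n\rangle|^2$. Otherwise, $\lambda_1 > 0 > \lambda_2$, and so

$$xx^* - yy^* = \lambda_1 u_1 u_1^* + \lambda_2 u_2 u_2^* = H$$

is in the null space of $\mathbf{A}$, meaning $\mathcal{A}(x) = \mathbf{A}xx^* = \mathbf{A}yy^* = \mathcal{A}(y)$. □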
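
Lemma 2.8 can be seen at work on the real measurement vectors $(1, 0)$, $(0, 1)$, $(1, 1)$ in $\mathbb{C}^2$ from Section 2.1, whose intensity measurements fail to separate $(1, i)$ from $(1, -i)$. The sketch below forms $H = xx^* - yy^*$ for that pair and checks that it is a rank-2 member of the null space of the super analysis operator. The helper names are illustrative, and the rank test via a $2 \times 2$ determinant is specific to this example.

```python
def outer(x):
    """Rank-one lift x x^* as a nested list."""
    return [[xj * xk.conjugate() for xk in x] for xj in x]


def hs_inner(H, G):
    """Hilbert-Schmidt inner product <H, G>_HS = Tr(G^* H)."""
    return sum(
        h * g.conjugate()
        for h_row, g_row in zip(H, G)
        for h, g in zip(h_row, g_row)
    )


def subtract(A, B):
    """Entrywise difference A - B of two matrices."""
    return [[a - b for a, b in zip(a_row, b_row)] for a_row, b_row in zip(A, B)]


x = [1 + 0j, 1j]
y = [1 + 0j, -1j]
ensemble = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1 + 0j]]

H = subtract(outer(x), outer(y))

# Image of H under the super analysis operator: one entry per measurement vector.
super_analysis_of_H = [hs_inner(H, outer(phi)) for phi in ensemble]

# For a 2 x 2 matrix, a nonzero determinant is equivalent to rank 2.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
```

Here every entry of `super_analysis_of_H` vanishes while `det_H` does not, so $H$ is a rank-2 matrix in the null space, consistent with the failure of injectivity for this ensemble.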

Lemma 2.8 indicates that we want the null space of $\mathbf{A}$ to avoid nonzero matrices
of rank $\leq 2$. Intuitively, this is easier when the "dimension" of this set of matrices
is small. To get some idea of this dimension, count real degrees of freedom: By
the spectral theorem, almost every matrix in $\mathbb{H}^{M \times M}$ of rank $\leq 2$ can be uniquely
expressed as $\lambda_1 u_1 u_1^* + \lambda_2 u_2 u_2^*$ with $\lambda_1 \leq \lambda_2$. Here, $(\lambda_1, \lambda_2)$ has two degrees of freedom.
Next, $u_1$ can be any vector in $\mathbb{C}^M$, except its norm must be 1. Also, since $u_1$ is only
unique up to global phase, we take its first entry to be nonnegative without loss of
generality. Given the norm and phase constraints, $u_1$ has a total of $2M - 2$ real
degrees of freedom. Finally, $u_2$ has the same norm and phase constraints, but it
must also be orthogonal to $u_1$, that is, $\operatorname{Re}\langle u_2, u_1\rangle = \operatorname{Im}\langle u_2, u_1\rangle = 0$. As such, $u_2$ has
$2M - 4$ real degrees of freedom. All together, we can expect the set of matrices in
question to have $2 + (2M - 2) + (2M - 4) = 4M - 4$ real dimensions.

If the set $S$ of matrices of rank $\leq 2$ formed a subspace of $\mathbb{H}^{M \times M}$ (it doesn't),
then we could expect the null space of $\mathbf{A}$ to intersect that subspace nontrivially
whenever $\dim \operatorname{null}(\mathbf{A}) + (4M - 4) > \dim(\mathbb{H}^{M \times M}) = M^2$. By the rank-nullity theorem,
this would indicate that injectivity requires

$$N \geq \operatorname{rank}(\mathbf{A}) = M^2 - \dim \operatorname{null}(\mathbf{A}) \geq 4M - 4. \tag{5}$$

Of course, this logic is not technically valid since $S$ is not a subspace. It is, however,
a special kind of set: a real projective variety. To see this, we first show that it
is a real algebraic variety, specifically, the set of members of $\mathbb{H}^{M \times M}$ for which all
$3 \times 3$ minors are zero. Of course, by the rank constraint, every member of $S$ has this
minor property. Next, we show that members of $S$ are the only matrices with this
property: If the rank of a given matrix is $\geq 3$, then it has an $M \times 3$ submatrix of
linearly independent columns, and since the rank of its transpose is also $\geq 3$, this
$M \times 3$ submatrix must have 3 linearly independent rows, thereby implicating a full-rank
$3 \times 3$ submatrix. This variety is said to be projective because it is closed under
scalar multiplication. If $S$ were a projective variety over an algebraically closed
field (it's not), then the projective dimension theorem (Theorem 7.2 of [61]) says
that $S$ intersects $\operatorname{null}(\mathbf{A})$ nontrivially whenever the dimensions are large enough:
$\dim \operatorname{null}(\mathbf{A}) + \dim S > \dim \mathbb{H}^{M \times M}$, thereby implying that injectivity requires (5).
Unfortunately, this theorem is not valid when the field is $\mathbb{R}$; for example, the cone
defined by $x^2 + y^2 - z^2 = 0$ in $\mathbb{R}^3$ is a projective variety of dimension 2, but its
intersection with the 2-dimensional $xy$-plane is trivial, despite the fact that $2 + 2 > 3$.

In  the  absence  of  a  proof,  we  pose  the  natural  conjecture: 
Conjecture 2.9 (The $4M - 4$ Conjecture). Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^M$ and the
mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^N$ defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. If $M \geq 2$, then the
following statements hold:

(a) If $N < 4M - 4$, then $\mathcal{A}$ is not injective.

(b) If $N \geq 4M - 4$, then $\mathcal{A}$ is injective for generic $\Phi$.

For the sake of clarity, we now explicitly state what is meant by the word
"generic." As indicated above, a real algebraic variety is the set of common zeros of
a finite set of polynomials with real coefficients. Taking all such varieties in $\mathbb{R}^n$ to
be closed sets defines the Zariski topology on $\mathbb{R}^n$. Viewing $\Phi$ as a member of $\mathbb{R}^{2MN}$,
we then say a generic $\Phi$ is any member of some undisclosed nonempty Zariski-open
subset of $\mathbb{R}^{2MN}$. Considering Zariski-open sets are either empty or dense with full
measure, genericity is a particularly strong property. As such, another way to state
part (b) of the $4M - 4$ conjecture is "If $N \geq 4M - 4$, then there exists a real
algebraic variety $V \subseteq \mathbb{R}^{2MN}$ such that $\mathcal{A}$ is injective for every $\Phi \notin V$." Note that
the work of Balan, Casazza and Edidin [8] already proves this for $N \geq 4M - 2$, and
in the time since we initially posed this conjecture [10,11], Conca, Edidin, Hering
and Vinzant [38] proved it for the case $M = 2^m + 1$, where $m$ is any positive integer.
Furthermore, Conca et al. successfully established part (b) by using techniques from
algebraic geometry to show that the set of non-injective ensembles is a subset of a
proper real algebraic variety and, hence, a Zariski closed set [38].

At this point, it is fitting to mention that after initially formulating this conjecture,
Bodmann presented a Vandermonde construction of $4M - 4$ injective intensity
measurements at a phase retrieval workshop at the Erwin Schrödinger International
Institute for Mathematical Physics. The result has since been documented in [17],
and it establishes one consequence of the $4M - 4$ conjecture: $N^*(M) \leq 4M - 4$.

As incremental progress toward solving the $4M - 4$ conjecture, we have the
following result:

Theorem 2.10. The $4M-4$ Conjecture is true when $M = 2$.

Proof. (a) Suppose $N < 4M-4 = 4$. Since $\mathbf{A}$ is a linear map from 4-dimensional real space to $N$-dimensional real space, the null space of $\mathbf{A}$ is necessarily nontrivial by the rank-nullity theorem. Furthermore, every nonzero member of this null space has rank 1 or 2, and so Lemma 2.8 gives that $\mathcal{A}$ is not injective.

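(b) Consider the following matrix formed by 16 real variables:

\[
\Phi(x) = \begin{bmatrix} x_1 + i x_2 & x_5 + i x_6 & x_9 + i x_{10} & x_{13} + i x_{14} \\ x_3 + i x_4 & x_7 + i x_8 & x_{11} + i x_{12} & x_{15} + i x_{16} \end{bmatrix}. \tag{6}
\]

If we denote the $n$th column of $\Phi(x)$ by $\varphi_n(x)$, then we have that $\mathcal{A}$ is injective precisely when $x \in \mathbb{R}^{16}$ produces a basis $\{\varphi_n(x)\varphi_n(x)^*\}_{n=1}^4$ for the space of $2 \times 2$ self-adjoint operators. Indeed, in this case $zz^*$ is uniquely determined by $\mathbf{A}zz^* = \{\langle zz^*, \varphi_n(x)\varphi_n(x)^*\rangle_{\mathrm{HS}}\}_{n=1}^4 = \mathcal{A}(z)$, which in turn determines $z$ up to a global phase factor. Let $A(x)$ be the $4 \times 4$ matrix representation of the super analysis operator, whose $n$th row gives the coordinates of $\varphi_n(x)\varphi_n(x)^*$ in terms of some basis for $\mathbb{H}^{2\times 2}$, say

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\frac{1}{\sqrt{2}}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
\frac{1}{\sqrt{2}}\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}. \tag{7}
\]

Then $V = \{x : \operatorname{Re}\det A(x) = \operatorname{Im}\det A(x) = 0\}$ is a real algebraic variety in $\mathbb{R}^{16}$, and we see that $\mathcal{A}$ is injective whenever $x \in V^c$. Since $V^c$ is Zariski-open, it is either empty or dense with full measure. In fact, $V^c$ is not empty, since we may take $x$ such that

\[
\Phi(x) = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & i \end{bmatrix},
\]

as indicated in Theorem 4.1 of [6]. Therefore, $V^c$ is dense with full measure. □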

Algorithm 1 The HMW test for injectivity when $M = 3$
Input: Measurement vectors $\{\varphi_n\}_{n=1}^N \subseteq \mathbb{C}^3$
Output: Whether $\mathcal{A}$ is injective

Define $\mathbf{A}\colon \mathbb{H}^{3\times 3} \to \mathbb{R}^N$ such that $\mathbf{A}H = \{\langle H, \varphi_n\varphi_n^*\rangle_{\mathrm{HS}}\}_{n=1}^N$
if $\dim\operatorname{null}(\mathbf{A}) = 0$ then
    Output: “INJECTIVE” {if $\mathbf{A}$ is injective, then $\mathcal{A}$ is injective}
else
    Pick $H \in \operatorname{null}(\mathbf{A})$, $H \neq 0$
    if $\dim\operatorname{null}(\mathbf{A}) = 1$ and $\det(H) \neq 0$ then
        Output: “INJECTIVE” {if $\mathbf{A}$ only maps nonsingular matrices to zero, then $\mathcal{A}$ is injective}
    else
        Output: “NOT INJECTIVE” {in the remaining case, $\mathbf{A}$ maps differences of rank-1 matrices to zero}
    end if
end if
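
Algorithm 1 can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the function names `hermitian_basis` and `hmw_test`, the particular orthonormal basis ordering, and the tolerance `tol` used to decide numerical rank and singularity are all choices made here and are not part of the original algorithm, which is stated over exact arithmetic.

```python
import numpy as np

def hermitian_basis(m=3):
    """Orthonormal (Hilbert-Schmidt) basis of the real space of m x m Hermitian matrices."""
    basis = []
    for j in range(m):
        E = np.zeros((m, m), dtype=complex)
        E[j, j] = 1.0
        basis.append(E)
    for j in range(m):
        for k in range(j + 1, m):
            S = np.zeros((m, m), dtype=complex)
            S[j, k] = S[k, j] = 1 / np.sqrt(2)
            basis.append(S)
            T = np.zeros((m, m), dtype=complex)
            T[j, k] = 1j / np.sqrt(2)
            T[k, j] = -1j / np.sqrt(2)
            basis.append(T)
    return basis

def hmw_test(Phi, tol=1e-9):
    """Phi is a 3 x N complex array whose columns are the measurement vectors.

    Returns True when the test reports INJECTIVE (up to the numerical tolerance).
    """
    basis = hermitian_basis(3)
    # Entry (n, k) is <B_k, phi_n phi_n^*>_HS = phi_n^* B_k phi_n, which is real.
    A = np.array([[np.vdot(phi, B @ phi).real for B in basis] for phi in Phi.T])
    _, s, Vt = np.linalg.svd(A)          # Vt is 9 x 9 (full_matrices=True by default)
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    nullity = 9 - rank
    if nullity == 0:
        return True
    if nullity == 1:
        # The last right-singular vector spans the null space; map it back to a matrix.
        H = sum(c * B for c, B in zip(Vt[-1], basis))
        return bool(abs(np.linalg.det(H)) > tol)
    return False
```

For instance, calling `hmw_test` on the 3 x 3 identity (three measurement vectors) reports non-injectivity, since the null space of the super analysis operator is then 6-dimensional.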


We also have a proof for the $M = 3$ case, but we first introduce Algorithm 1, namely the HMW test for injectivity; we name it after Heinosaari, Mazzarella and Wolf, who implicitly introduce this algorithm in their paper [62].

Theorem 2.11 (cf. Proposition 6 in [62]). When $M = 3$, the HMW test correctly determines whether $\mathcal{A}$ is injective.

Proof. First, if $\mathbf{A}$ is injective, then $\mathcal{A}(x) = \mathbf{A}xx^* = \mathbf{A}yy^* = \mathcal{A}(y)$ if and only if $xx^* = yy^*$, i.e., $y \equiv x \bmod \mathbb{T}$. Next, suppose $\mathbf{A}$ has a 1-dimensional null space. Then Lemma 2.8 gives that $\mathcal{A}$ is injective if and only if the null space of $\mathbf{A}$ is spanned by a matrix of full rank. Finally, if the dimension of the null space is 2 or more, then there exist linearly independent (nonzero) matrices $A$ and $B$ in this null space. If $\det(A) = 0$, then $A$ must have rank 1 or 2, and so Lemma 2.8 gives that $\mathcal{A}$ is not injective. Otherwise, consider the map

\[
f\colon t \mapsto \det(A\cos t + B\sin t) \qquad \forall t \in [0,\pi].
\]

Since $f(0) = \det(A)$ and $f(\pi) = \det(-A) = (-1)^3\det(A) = -\det(A)$, the intermediate value theorem gives that there exists $t_0 \in [0,\pi]$ such that $f(t_0) = 0$, i.e., the matrix $A\cos t_0 + B\sin t_0$ is singular. Moreover, this matrix is nonzero since $A$ and $B$ are linearly independent, and so its rank is either 1 or 2. Lemma 2.8 then gives that $\mathcal{A}$ is not injective. □
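
The intermediate value argument in this proof is constructive enough to sketch in code. The following is a minimal illustration, assuming two Hermitian $3\times 3$ inputs with $\det(A) \neq 0$; the function name `singular_combination` and the use of bisection are choices made here for illustration and do not appear in the source.

```python
import numpy as np

def singular_combination(A, B, iters=60):
    """Bisection for t0 in [0, pi] with det(A cos t0 + B sin t0) = 0.

    Relies on f(pi) = -f(0) for 3 x 3 matrices, so f changes sign on [0, pi].
    """
    f = lambda t: np.linalg.det(A * np.cos(t) + B * np.sin(t)).real
    lo, hi = 0.0, np.pi
    flo = f(lo)
    if flo == 0.0:
        return lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid == 0.0:
            return mid
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid      # sign change still lies in [mid, hi]
        else:
            hi = mid                 # sign change lies in [lo, mid]
    return 0.5 * (lo + hi)

A = np.eye(3, dtype=complex)
B = np.diag([1.0, -1.0, 0.0]).astype(complex)
t0 = singular_combination(A, B)
```

Here $f(t) = \cos(2t)\cos t$, so the returned `t0` is one of the roots of this function in $[0,\pi]$, and the matrix $A\cos t_0 + B\sin t_0$ is numerically singular.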

As  an  example,  we  may  run  the  HMW  test  on  the  columns  of  the  following 
matrix: 


In this case, the null space of $\mathbf{A}$ is 1-dimensional and spanned by a nonsingular matrix. As such, $\mathcal{A}$ is injective. We will see that the HMW test has a few important applications. First, we use it to prove the $4M-4$ Conjecture in the $M = 3$ case:

Theorem 2.12. The $4M-4$ Conjecture is true when $M = 3$.

Proof. (a) Suppose $N < 4M-4 = 8$. Then by the rank-nullity theorem, the super analysis operator $\mathbf{A}\colon \mathbb{H}^{3\times 3} \to \mathbb{R}^N$ has a null space of at least 2 dimensions, and so by the HMW test, $\mathcal{A}$ is not injective.

(b) Consider a $3 \times 8$ matrix $\Phi(x)$ formed by 48 real variables, similar to (6). Then $\mathcal{A}$ is injective whenever $x \in \mathbb{R}^{48}$ produces an ensemble $\{\varphi_n(x)\}_{n=1}^8 \subseteq \mathbb{C}^3$ that passes the HMW test. To pass, the rank-nullity theorem says that the null space of the super analysis operator must be 1-dimensional and spanned by a nonsingular matrix. We use an orthonormal basis for $\mathbb{H}^{3\times 3}$ similar to (7) to find an $8 \times 9$ matrix representation $A(x)$ of the super analysis operator; it is easy to check that the entries of $A(x)$ are polynomial functions of $x$. Consider the matrix

\[
B(x,y) = \begin{bmatrix} y^{\mathrm{T}} \\ A(x) \end{bmatrix},
\]
and let $u(x)$ denote the vector of $(1,j)$th cofactors of $B(x,y)$. Then $\langle y, u(x)\rangle = \det(B(x,y))$. This implies that $u(x)$ is in the null space of $A(x)$: taking $y$ to be any row of $A(x)$ gives $B(x,y)$ a repeated row, and so each row of $A(x)$ is necessarily orthogonal to $u(x)$.

We claim that $u(x) = 0$ if and only if the dimension of the null space of $A(x)$ is 2 or more, that is, the rows of $A(x)$ are linearly dependent. First, ($\Leftarrow$) is true since the entries of $u(x)$ are signed determinants of $8 \times 8$ submatrices of $A(x)$, which are necessarily zero by the linear dependence of the rows. For ($\Rightarrow$), we have that $0 = \langle y, 0\rangle = \langle y, u(x)\rangle = \det(B(x,y))$ for all $y \in \mathbb{R}^9$. That is, even if $y$ is nonzero and orthogonal to the rows of $A(x)$, the rows of $B(x,y)$ are linearly dependent, and so the rows of $A(x)$ must be linearly dependent. This proves the intermediate claim.

We now use the claim to prove the result. The entries of $u(x)$ are coordinates of a matrix $U(x) \in \mathbb{H}^{3\times 3}$ in the same basis as before. Note that the entries of $U(x)$ are polynomials of $x$. Furthermore, $\mathcal{A}$ is injective if and only if $\det U(x) \neq 0$. To see this, observe three cases:

Case I: $U(x) = 0$, i.e., $u(x) = 0$, or equivalently, $\dim\operatorname{null}(A(x)) \geq 2$. By the HMW test, $\mathcal{A}$ is not injective.

Case II: The null space is spanned by $U(x) \neq 0$, but $\det U(x) = 0$. By the HMW test, $\mathcal{A}$ is not injective.

Case III: The null space is spanned by $U(x) \neq 0$, and $\det U(x) \neq 0$. By the HMW test, $\mathcal{A}$ is injective.

Defining the real algebraic variety $V = \{x : \det U(x) = 0\} \subseteq \mathbb{R}^{48}$, we then have that $\mathcal{A}$ is injective precisely when $x \in V^c$. Since $V^c$ is Zariski-open, it is either empty or dense with full measure, but it is nonempty since (8) passes the HMW test. Therefore, $V^c$ is dense with full measure. □

To be clear, this result has since been proven as part of a larger class of ensembles for which the conjecture holds, namely, the case $M = 2^m+1$ for any positive integer $m$ [38]. In fact, Conca, Edidin, Hering and Vinzant prove much more:

Theorem 2.13 (Theorem 1.1 and Proposition 5.4 in [38]).

(a) If $m$ is any positive integer, then part (a) of the $4M-4$ Conjecture is true for $M = 2^m+1$.

(b) Part (b) of the $4M-4$ Conjecture is true.

As a consequence of Theorems 2.10 and 2.13, the first remaining open case of the $4M-4$ Conjecture is $M = 4$.

Now recall Wright’s conjecture: there exist unitary matrices $U_1$, $U_2$ and $U_3$ such that $\Phi = [U_1\ U_2\ U_3]$ yields injective intensity measurements. Also recall that Wright’s conjecture implies $N^*(M) \leq 3M-2$. Again, both of these were disproved by Heinosaari et al. [62] using deep results in differential geometry. Alternatively, Theorem 2.12 also disproves these in the case where $M = 3$, since $N^*(3) = 4(3)-4 = 8 > 7 = 3(3)-2$.

Note  that  the  HMW  test  can  be  used  to  test  for  injectivity  in  three  dimensions 
regardless  of  the  number  of  measurement  vectors.  As  such,  it  can  be  used  to  evaluate 
ensembles  of  3  x  3  unitary  matrices  for  quantum  mechanics.  For  example,  consider 
the  3x3  fractional  discrete  Fourier  transform,  defined  in  [22]  using  discrete  Hermite- 
Gaussian  functions: 
It can be shown by the HMW test that $\Phi = [I\ F^{1/2}\ F\ F^{3/2}]$ yields injective intensity measurements. This leads to the following refinement of Wright’s conjecture:

Conjecture 2.14. Let $F$ denote the $M \times M$ discrete fractional Fourier transform defined in [22]. Then for every $M \geq 3$, $\Phi = [I\ F^{1/2}\ F\ F^{3/2}]$ yields injective intensity measurements.

This  conjecture  can  be  viewed  as  the  discrete  analog  to  the  work  of  Jaming  [67], 
in  which  ensembles  of  continuous  fractional  Fourier  transforms  are  evaluated  for 
injectivity. 

2.3 Achieving injectivity with 4M − 4 intensity measurements

In this section, we provide an ensemble of $4M-4$ measurement vectors which yield injective intensity measurements for $\mathbb{C}^M$. The vectors in this ensemble are modulated discrete cosine functions, and they are explicitly constructed at the end of this section. We start here by motivating the construction, specifically by identifying the significance of circular autocorrelation, which we define in (9) below.

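Consider the $P$-dimensional complex vector space

\[
\ell(\mathbb{Z}_P) := \{u\colon \mathbb{Z} \to \mathbb{C} : u(p+P) = u(p)\ \forall p \in \mathbb{Z}\}.
\]

The discrete Fourier basis in $\ell(\mathbb{Z}_P)$ is the sequence of $P$ vectors $\{f_q\}_{q\in\mathbb{Z}_P}$ defined by $f_q(p) := e^{2\pi i pq/P}$ (the notation “$q \in \mathbb{Z}_P$” is taken to mean a set of coset representatives of $\mathbb{Z}$ with respect to the subgroup $P\mathbb{Z}$). The discrete Fourier transform (DFT) on $\mathbb{Z}_P$ is $F^*\colon \ell(\mathbb{Z}_P) \to \ell(\mathbb{Z}_P)$, with corresponding inverse DFT $(F^*)^{-1} = \frac{1}{P}F$, defined by

\[
\begin{aligned}
(F^*u)(q) &= \langle u, f_q\rangle = \sum_{p\in\mathbb{Z}_P} u(p)e^{-2\pi i pq/P}, \\
(Fv)(p) &= \sum_{q\in\mathbb{Z}_P} v(q)f_q(p) = \sum_{q\in\mathbb{Z}_P} v(q)e^{2\pi i pq/P}.
\end{aligned}
\]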

Now let $T^p\colon \ell(\mathbb{Z}_P) \to \ell(\mathbb{Z}_P)$ be the translation operator $(T^pu)(p') := u(p'-p)$. The circular autocorrelation of $u$ is then $\operatorname{CirAut}(u) \in \ell(\mathbb{Z}_P)$, defined entrywise by

\[
(\operatorname{CirAut}(u))(p) := \langle u, T^pu\rangle = \sum_{p'\in\mathbb{Z}_P} u(p')\overline{u(p'-p)}. \tag{9}
\]

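Consider the DFT of a circular autocorrelation:

\[
\begin{aligned}
(F^*\operatorname{CirAut}(u))(q)
&= \sum_{p\in\mathbb{Z}_P}\sum_{p'\in\mathbb{Z}_P} u(p')\overline{u(p'-p)}\,e^{-2\pi i pq/P} \\
&= \sum_{p'\in\mathbb{Z}_P} u(p')e^{-2\pi i p'q/P} \sum_{p\in\mathbb{Z}_P} \overline{u(p'-p)e^{-2\pi i (p'-p)q/P}} \\
&= \bigg(\sum_{p'\in\mathbb{Z}_P} u(p')e^{-2\pi i p'q/P}\bigg)\overline{\bigg(\sum_{p''\in\mathbb{Z}_P} u(p'')e^{-2\pi i p''q/P}\bigg)} = |\langle u, f_q\rangle|^2.
\end{aligned} \tag{10}
\]

As such, if one has the intensity measurements $\{|\langle u, f_q\rangle|^2\}_{q\in\mathbb{Z}_P}$, then one may compute the circular autocorrelation $\operatorname{CirAut}(u)$ by applying the inverse DFT. In order to perform phase retrieval from $\{|\langle u, f_q\rangle|^2\}_{q\in\mathbb{Z}_P}$, it therefore suffices to determine $u$ from $\operatorname{CirAut}(u)$. This is the motivation for the approach in this section.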
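
The identity (10) is easy to check numerically. The sketch below assumes NumPy's FFT convention, in which `np.fft.fft` computes $\sum_p u(p)e^{-2\pi i pq/P}$ and therefore plays the role of $F^*$; the helper name `cir_aut` and the random test signal are illustrative choices, not part of the source.

```python
import numpy as np

def cir_aut(u):
    """(CirAut u)(p) = sum_{p'} u(p') * conj(u(p' - p)), with indices taken mod len(u)."""
    # np.roll(u, p)[p'] equals u[p' - p]; np.vdot conjugates its first argument.
    return np.array([np.vdot(np.roll(u, p), u) for p in range(len(u))])

rng = np.random.default_rng(0)
u = rng.standard_normal(7) + 1j * rng.standard_normal(7)

dft_of_autocorrelation = np.fft.fft(cir_aut(u))   # left-hand side of (10)
intensities = np.abs(np.fft.fft(u)) ** 2          # right-hand side of (10)
autocorrelation_from_intensities = np.fft.ifft(intensities)
```

The last line illustrates the remark above: the intensity measurements against the Fourier basis determine the circular autocorrelation through an inverse DFT.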

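To see how to “invert” CirAut, let’s consider an example. Take $x = (a, b, c) \in \mathbb{C}^3$ and consider the circular autocorrelation of $x$ as a signal in $\ell(\mathbb{Z}_3)$:

\[
\operatorname{CirAut}(x) = \big(|a|^2 + |b|^2 + |c|^2,\ a\bar{c} + b\bar{a} + c\bar{b},\ a\bar{b} + b\bar{c} + c\bar{a}\big).
\]

Notice that every entry of $\operatorname{CirAut}(x)$ is a nonlinear combination of the entries of $x$, from which it is unclear how to compute the entries of $x$. To simplify the structure, we pad $x$ with zeros and enforce even symmetry; then the circular autocorrelation of $u := (2a, b, c, 0, 0, 0, 0, c, b) \in \ell(\mathbb{Z}_9)$ is

\[
\begin{aligned}
\operatorname{CirAut}(u) = \big(&4|a|^2 + 2|b|^2 + 2|c|^2,\ 2\operatorname{Re}(2a\bar{b} + b\bar{c}),\ |b|^2 + 4\operatorname{Re}(a\bar{c}),\ 2\operatorname{Re}(b\bar{c}),\ |c|^2, \\
&|c|^2,\ 2\operatorname{Re}(b\bar{c}),\ |b|^2 + 4\operatorname{Re}(a\bar{c}),\ 2\operatorname{Re}(2a\bar{b} + b\bar{c})\big).
\end{aligned} \tag{11}
\]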


Although it still appears rather complicated, this circular autocorrelation actually lends itself well to recovering the entries of $x$.

Before explaining this further, first note that $9 = 4(3)-3$, and we can generalize our mapping $x \mapsto u$ by sending vectors in $\mathbb{C}^M$ to members of $\ell(\mathbb{Z}_{4M-3})$. To make this clear, consider the reversal operator $R\colon \ell(\mathbb{Z}_P) \to \ell(\mathbb{Z}_P)$ defined by $(Ru)(p) = u(-p)$. Then given a vector $x \in \mathbb{C}^M$, padding with zeros and enforcing even symmetry is equivalent to embedding $x$ in $\ell(\mathbb{Z}_{4M-3})$ by appending $3M-3$ zeros to $x$ and then taking $u = x + Rx \in \ell(\mathbb{Z}_{4M-3})$. (From this point forward, “$x$” is used to represent both the original signal in $\mathbb{C}^M$ and the version of $x$ embedded in $\ell(\mathbb{Z}_{4M-3})$ via zero-padding; the distinction will be clear from context.) Computing $x \in \mathbb{C}^M$ then reduces to determining the first $M$ entries of $x \in \ell(\mathbb{Z}_{4M-3})$ from $\operatorname{CirAut}(x + Rx)$. If $x$ is completely real-valued, then this is indeed possible. For instance, consider the circular autocorrelation (11). If the entries of $x$ are all real, then this becomes

\[
\begin{aligned}
\operatorname{CirAut}(x + Rx) = \big(&4a^2 + 2b^2 + 2c^2,\ 4ab + 2bc,\ b^2 + 4ac,\ 2bc,\ c^2, \\
&c^2,\ 2bc,\ b^2 + 4ac,\ 4ab + 2bc\big).
\end{aligned}
\]

Since $(\operatorname{CirAut}(x + Rx))(4) = c^2$, we simply take a square root to obtain $c$ up to a sign. Assuming $c$ is nonzero, we then divide $(\operatorname{CirAut}(x + Rx))(3)$ by $2c$ to determine $b$ up to the same sign. Then subtracting $b^2$ from $(\operatorname{CirAut}(x + Rx))(2)$ and dividing by $4c$ gives $a$ up to the same sign.
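
The three-step recovery just described can be sketched directly. This is a minimal illustration for real $x \in \mathbb{R}^3$ with nonzero last entry; the names `cir_aut` and `recover_real3` and the sample signals are hypothetical and chosen here only for the example.

```python
import numpy as np

def cir_aut(u):
    """(CirAut u)(p) = sum_{p'} u(p') * conj(u(p' - p)), with indices taken mod len(u)."""
    return np.array([np.vdot(np.roll(u, p), u) for p in range(len(u))])

def recover_real3(corr):
    """Recover real x = (a, b, c), c != 0, up to sign from corr = CirAut(x + Rx) in l(Z_9)."""
    c = np.sqrt(corr[4])                  # entry 4 equals c^2
    b = corr[3] / (2 * c)                 # entry 3 equals 2bc
    a = (corr[2] - b ** 2) / (4 * c)      # entry 2 equals b^2 + 4ac
    return np.array([a, b, c])

def even_embedding(x):
    """The signal x + Rx = (2a, b, c, 0, 0, 0, 0, c, b) for x = (a, b, c)."""
    a, b, c = x
    return np.array([2 * a, b, c, 0.0, 0.0, 0.0, 0.0, c, b])

x_pos = np.array([1.5, -2.0, 0.5])
corr_pos = cir_aut(even_embedding(x_pos)).real
x_neg = np.array([1.0, 2.0, -3.0])
corr_neg = cir_aut(even_embedding(x_neg)).real
```

Because the square root always returns the nonnegative root, `recover_real3` returns $x$ itself when $c > 0$ and $-x$ when $c < 0$, which is exactly the sign ambiguity described above.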

From this example, we see that the process of recovering the entries of $x$ from $\operatorname{CirAut}(x + Rx)$ is iterative, working backward through its first $2M-2$ entries. But what happens if $c$ is zero? Fortunately, this process doesn’t break: In this case, we have

\[
\operatorname{CirAut}(x + Rx) = (4a^2 + 2b^2,\ 4ab,\ b^2,\ 0,\ 0,\ 0,\ 0,\ b^2,\ 4ab).
\]

Thus, we need only start with $(\operatorname{CirAut}(x + Rx))(2)$ to determine the remaining entries of $x$ up to a sign. This observation brings to light the important role of the last nonzero entry of $x$ in the iteration. The relationship between this coordinate and the entries of $\operatorname{CirAut}(x + Rx)$ will become more rigorous later.

The above example illustrated how a real signal $x$ is determined (up to a sign) by $\operatorname{CirAut}(x + Rx)$. A complex-valued signal, on the other hand, is not completely determined from $\operatorname{CirAut}(x + Rx)$. Luckily, this can be fixed by introducing a second vector in $\ell(\mathbb{Z}_{4M-3})$ obtained from $x$, and we will demonstrate this later, but for now we focus on $x + Rx$. To this end, we first take a closer look at the entries of $\operatorname{CirAut}(x + Rx)$. Since this circular autocorrelation has even symmetry by construction, we need only consider the entries of $\operatorname{CirAut}(x + Rx)$ up to index $2M-2$. This leads to the following lemma:

Lemma 2.15. Let $x$ denote an $M$-dimensional complex signal embedded in $\ell(\mathbb{Z}_{4M-3})$ such that $x(p) = 0$ for all $p = M, \ldots, 4M-4$. Then

\[
(\operatorname{CirAut}(x + Rx))(p) = 2\operatorname{Re}\langle x, T^px\rangle + \langle x, RT^{-p}x\rangle
\]

for all $p = 1, \ldots, 2M-2$.

Proof. First note that by the definition of the circular autocorrelation in (9) we have

\[
\begin{aligned}
(\operatorname{CirAut}(x + Rx))(p) &= \langle x + Rx, T^p(x + Rx)\rangle \\
&= 2\operatorname{Re}\langle x, T^px\rangle + \langle x, RT^{-p}x\rangle + \langle x, RT^px\rangle.
\end{aligned}
\]

Thus, to complete the proof it suffices to show that $\langle x, RT^px\rangle = 0$ for all $p = 1, \ldots, 2M-2$. Since $x$ is only nonzero in its first $M$ entries, we have

\[
\langle x, RT^px\rangle = \sum_{p'=0}^{M-1} x(p')\overline{(RT^px)(p')} = \sum_{p'=0}^{M-1} x(p')\overline{(T^px)(-p')} = \sum_{p'=0}^{M-1} x(p')\overline{x(-p'-p)},
\]

where the summand is zero whenever $-p'-p \notin [0, M-1]$ modulo $4M-3$. This is equivalent to having $-p$ not lie in the Minkowski sum $p' + [0, M-1]$, and since $p' \in [0, M-1]$ we see that $\langle x, RT^px\rangle = 0$ for all $p = 1, \ldots, 2M-2$. □

As a consequence of Lemma 2.15, the following theorem expresses the entries of $\operatorname{CirAut}(x + Rx)$ in terms of the entries of $x$:

Theorem 2.16. Let $x$ denote an $M$-dimensional complex signal embedded in $\ell(\mathbb{Z}_{4M-3})$ such that $x(p) = 0$ for all $p = M, \ldots, 4M-4$. Then we have

\[
(\operatorname{CirAut}(x + Rx))(p) =
\begin{cases}
2\operatorname{Re}\Big(\displaystyle\sum_{p'=\frac{p+1}{2}}^{M-1} x(p')\big(\overline{x(p'-p)} + \overline{x(p-p')}\big)\Big) & \text{if } p \text{ is odd} \\[2ex]
2\operatorname{Re}\Big(\displaystyle\sum_{p'=\frac{p}{2}+1}^{M-1} x(p')\big(\overline{x(p'-p)} + \overline{x(p-p')}\big)\Big) + \big|x(\tfrac{p}{2})\big|^2 & \text{if } p \text{ is even}
\end{cases} \tag{12}
\]

for all $p = 1, \ldots, 2M-2$.

Proof. We first use Lemma 2.15 to get

\[
\begin{aligned}
(\operatorname{CirAut}(x + Rx))(p) &= 2\operatorname{Re}\langle x, T^px\rangle + \langle x, RT^{-p}x\rangle \\
&= 2\operatorname{Re}\Big(\sum_{p'=0}^{M-1} x(p')\overline{x(p'-p)}\Big) + \sum_{p'=0}^{M-1} x(p')\overline{x(p-p')} \\
&= 2\operatorname{Re}\Big(\sum_{p'=p}^{M-1} x(p')\overline{x(p'-p)}\Big) + \sum_{p'=\max\{p-(M-1),0\}}^{\min\{p,M-1\}} x(p')\overline{x(p-p')},
\end{aligned} \tag{13}
\]

where the last equality takes into account that the first summand is nonzero only when $p'-p \in [0, M-1]$ and the second summand is nonzero only when $p-p' \in [0, M-1]$, i.e., when $p' \in [p, p+(M-1)]$ and $p' \in [p-(M-1), p]$, respectively. To continue, we divide our analysis into cases.
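
For $p = 1, \ldots, M-1$, (13) gives

\[
(\operatorname{CirAut}(x + Rx))(p) = 2\operatorname{Re}\Big(\sum_{p'=p}^{M-1} x(p')\overline{x(p'-p)}\Big) + \sum_{p'=0}^{p} x(p')\overline{x(p-p')}. \tag{14}
\]

If $p$ is odd we can then write

\[
\begin{aligned}
\sum_{p'=0}^{p} x(p')\overline{x(p-p')}
&= \sum_{p'=0}^{\frac{p-1}{2}} x(p')\overline{x(p-p')} + \sum_{p'=\frac{p+1}{2}}^{p} x(p')\overline{x(p-p')} \\
&= \sum_{p''=\frac{p+1}{2}}^{p} x(p-p'')\overline{x(p'')} + \sum_{p'=\frac{p+1}{2}}^{p} x(p')\overline{x(p-p')} \\
&= 2\operatorname{Re}\Big(\sum_{p'=\frac{p+1}{2}}^{p} x(p')\overline{x(p-p')}\Big),
\end{aligned} \tag{15}
\]

while if $p$ is even we similarly write

\[
\sum_{p'=0}^{p} x(p')\overline{x(p-p')} = 2\operatorname{Re}\Big(\sum_{p'=\frac{p}{2}+1}^{p} x(p')\overline{x(p-p')}\Big) + \big|x(\tfrac{p}{2})\big|^2. \tag{16}
\]

Substituting (15) and (16) into (14) then gives (12).

For the remaining case, $p = M, \ldots, 2M-2$ and (13) gives

\[
(\operatorname{CirAut}(x + Rx))(p) = \sum_{p'=p-(M-1)}^{M-1} x(p')\overline{x(p-p')}. \tag{17}
\]

Similar to the previous case, taking $p$ to be odd yields

\[
\sum_{p'=p-(M-1)}^{M-1} x(p')\overline{x(p-p')} = 2\operatorname{Re}\Big(\sum_{p'=\frac{p+1}{2}}^{M-1} x(p')\overline{x(p-p')}\Big), \tag{18}
\]

while taking $p$ to be even yields

\[
\sum_{p'=p-(M-1)}^{M-1} x(p')\overline{x(p-p')} = 2\operatorname{Re}\Big(\sum_{p'=\frac{p}{2}+1}^{M-1} x(p')\overline{x(p-p')}\Big) + \big|x(\tfrac{p}{2})\big|^2, \tag{19}
\]

and substituting (18) and (19) into (17) also gives (12). □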
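
As a sanity check on (12), one can compare the closed form against a direct evaluation of the circular autocorrelation. The sketch below is only illustrative: the helper names (`cir_aut`, `reverse`, `formula_12`), the dimension $M = 5$, and the random signal are assumptions made here, and entries of $x$ with index outside $[0, M-1]$ are treated as zero, matching the zero-padded embedding.

```python
import numpy as np

def cir_aut(u):
    """(CirAut u)(p) = sum_{p'} u(p') * conj(u(p' - p)), with indices taken mod len(u)."""
    return np.array([np.vdot(np.roll(u, p), u) for p in range(len(u))])

def reverse(u):
    """(Ru)(p) = u(-p), with indices taken mod len(u)."""
    return np.roll(u[::-1], 1)

def formula_12(x, p):
    """Right-hand side of (12) for 1 <= p <= 2M - 2, with x of length M."""
    M = len(x)
    xe = lambda n: x[n] if 0 <= n < M else 0.0     # zero-padded entries of x
    start = (p + 1) // 2 if p % 2 else p // 2 + 1
    s = sum(x[k] * (np.conj(xe(k - p)) + np.conj(xe(p - k))) for k in range(start, M))
    extra = 0.0 if p % 2 else abs(x[p // 2]) ** 2
    return 2 * np.real(s) + extra

M = 5
rng = np.random.default_rng(0)
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
x_padded = np.concatenate([x, np.zeros(3 * M - 3)])        # embed in l(Z_{4M-3})
direct = cir_aut(x_padded + reverse(x_padded)).real
closed_form = np.array([formula_12(x, p) for p in range(1, 2 * M - 1)])
```

The arrays `direct[1:2*M-1]` and `closed_form` should agree to floating-point precision.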

Notice (12) shows that each member of $\{(\operatorname{CirAut}(x + Rx))(p)\}_{p=1}^{2M-2}$ can be written as a combination of the first $M$ entries of $x$, but only those at or beyond the $\lceil p/2\rceil$th index. As such, the index of the last nonzero entry of $x$ is closely related to that of the last nonzero entry of $\{(\operatorname{CirAut}(x + Rx))(p)\}_{p=0}^{2M-2}$. This corresponds to the observation earlier in the case of $x \in \mathbb{R}^3$ where the third coordinate was assumed to be zero. We identify the relationship between the locations of these nonzero entries in the following lemma:

Lemma 2.17. Let $x$ denote an $M$-dimensional complex signal embedded in $\ell(\mathbb{Z}_{4M-3})$ such that $x(p) = 0$ for all $p = M, \ldots, 4M-4$. Then the last nonzero entry of $\{(\operatorname{CirAut}(x + Rx))(p)\}_{p=0}^{2M-2}$ has index $p = 2q$, where $q$ is the index of the last nonzero entry of $x$.

Proof. If $q \geq 1$, then (12) gives that $(\operatorname{CirAut}(x + Rx))(2q) = |x(q)|^2 \neq 0$. Note that since $x(p') = 0$ for every $p' > q$, (12) also gives that $(\operatorname{CirAut}(x + Rx))(p) = 0$ for every $p > 2q$. For the remaining case where $q = 0$, (12) immediately gives that $(\operatorname{CirAut}(x + Rx))(p) = 0$ for every $p \geq 1$. To show that $(\operatorname{CirAut}(x + Rx))(0) \neq 0$ in this case, we apply the definition of circular autocorrelation in (9):

\[
(\operatorname{CirAut}(x + Rx))(0) = \langle x + Rx, x + Rx\rangle = \|x + Rx\|^2 = |2x(0)|^2 \neq 0,
\]

where the last equality uses the fact that $x$ is only supported at 0 (since $q = 0$). □

As previously mentioned, we are unable to recover the entries of a complex signal $x$ solely from $\operatorname{CirAut}(x + Rx)$. One way to address this is to rotate the entries of $x$ in the complex plane and also take the circular autocorrelation of this modified signal. If we rotate by an angle which is not an integer multiple of $\pi$, this will produce new entries which are linearly independent from the corresponding entries of $x$ when viewed as vectors in the complex plane. As we will see, the problem of recovering the entries of $x$ then reduces to solving a linear system.

Take any $(4M-3) \times (4M-3)$ diagonal modulation operator $E$ whose diagonal entries $\{\omega_k\}_{k=0}^{4M-4}$ are of unit modulus and satisfy $\omega_j\overline{\omega_k} \notin \mathbb{R}$ for all $j \neq k$, and consider the new vector $Ex \in \ell(\mathbb{Z}_{4M-3})$. Then Theorem 2.16 gives

\[
(\operatorname{CirAut}(Ex + REx))(p) =
\begin{cases}
2\operatorname{Re}\Big(\displaystyle\sum_{p'=\frac{p+1}{2}}^{M-1} \omega_{p'}x(p')\big(\overline{\omega_{p'-p}x(p'-p)} + \overline{\omega_{p-p'}x(p-p')}\big)\Big) & \text{if } p \text{ is odd} \\[2ex]
2\operatorname{Re}\Big(\displaystyle\sum_{p'=\frac{p}{2}+1}^{M-1} \omega_{p'}x(p')\big(\overline{\omega_{p'-p}x(p'-p)} + \overline{\omega_{p-p'}x(p-p')}\big)\Big) + \big|x(\tfrac{p}{2})\big|^2 & \text{if } p \text{ is even}
\end{cases} \tag{20}
\]

for all $p = 1, \ldots, 2M-2$. We will see that (12) and (20) together allow us to solve for the entries of $x$ (up to a global phase factor) by working iteratively backward through the entries of $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$. As alluded to earlier, each entry index forms a linear system which can be solved using the following lemma:

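Lemma 2.18. Let $a, b \in \mathbb{C} \setminus \{0\}$ and $\omega \in \mathbb{C} \setminus \mathbb{R}$ with $|\omega| = 1$. Then

\[
b = \frac{i}{\bar{a}\operatorname{Im}(\omega)}\big(\operatorname{Re}(\omega a\bar{b}) - \omega\operatorname{Re}(a\bar{b})\big). \tag{21}
\]

Proof. By direct manipulation, we have

\[
\begin{aligned}
\operatorname{Re}(\omega a\bar{b}) - \omega\operatorname{Re}(a\bar{b})
&= \operatorname{Re}(\omega)\operatorname{Re}(a\bar{b}) - \operatorname{Im}(\omega)\operatorname{Im}(a\bar{b}) - \omega\operatorname{Re}(a\bar{b}) \\
&= -i\operatorname{Im}(\omega)\big(\operatorname{Re}(a\bar{b}) - i\operatorname{Im}(a\bar{b})\big) \\
&= -i\operatorname{Im}(\omega)\big(\operatorname{Re}(\bar{a}b) + i\operatorname{Im}(\bar{a}b)\big) \\
&= -i\bar{a}b\operatorname{Im}(\omega).
\end{aligned}
\]

Rearranging then yields the desired result. □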
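
A quick numerical check of (21) may help fix the conventions (which factor is conjugated, and the placement of $i$). The function name `solve_b` and the sample values of $a$, $b$ and $\omega$ below are arbitrary illustrative choices.

```python
import numpy as np

def solve_b(a, omega, re_ab, re_wab):
    """Equation (21): recover b from Re(a conj(b)) and Re(omega a conj(b))."""
    return 1j * (re_wab - omega * re_ab) / (np.conj(a) * omega.imag)

a, b = 2.0 - 1.0j, -0.5 + 3.0j
omega = np.exp(2j * np.pi / 7)                 # unit modulus, not real
re_ab = (a * np.conj(b)).real
re_wab = (omega * a * np.conj(b)).real
b_hat = solve_b(a, omega, re_ab, re_wab)
```

Only the two real numbers `re_ab` and `re_wab` (together with $a$ and $\omega$) enter the formula, yet `b_hat` reproduces the complex number $b$.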

We now use this lemma to describe how to recover $x$ up to global phase. By Lemma 2.17, the last nonzero entry of $\{(\operatorname{CirAut}(x + Rx))(p)\}_{p=0}^{2M-2}$ has index $p = 2q$, where $q$ indexes the last nonzero entry of $x$. As such, we know that $x(k) = 0$ for every $k > q$, and $x(q)$ can be estimated up to a phase factor ($\hat{x}(q) = e^{i\psi}x(q)$) by taking the square root of $(\operatorname{CirAut}(x + Rx))(2q) = |x(q)|^2$ (we will verify this soon, but this corresponds to the examples we have seen so far). Next, if we know $\operatorname{Re}(x(q)\overline{x(k)})$ and $\operatorname{Re}(\omega_q\overline{\omega_k}x(q)\overline{x(k)})$ for some $k < q$, then we can use these to estimate $x(k)$:

\[
\hat{x}(k) := \frac{i}{\overline{\hat{x}(q)}\operatorname{Im}(\omega_q\overline{\omega_k})}\Big(\operatorname{Re}\big(\omega_q\overline{\omega_k}x(q)\overline{x(k)}\big) - \omega_q\overline{\omega_k}\operatorname{Re}\big(x(q)\overline{x(k)}\big)\Big) = e^{i\psi}x(k), \tag{22}
\]

where the last equality follows from substituting $a = x(q)$, $b = x(k)$ and $\omega = \omega_q\overline{\omega_k}$ into (21). Overall, once we know $x(q)$ up to phase, we can then find $x(k)$ relative to this same phase for each $k = 0, \ldots, q-1$, provided we know $\operatorname{Re}(x(q)\overline{x(k)})$ and $\operatorname{Re}(\omega_q\overline{\omega_k}x(q)\overline{x(k)})$ for these $k$’s. Thankfully, these values can be determined from the entries of $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$:

Theorem 2.19. Let $x$ denote an $M$-dimensional complex signal embedded in $\ell(\mathbb{Z}_{4M-3})$ such that $x(p) = 0$ for all $p = M, \ldots, 4M-4$, and let $E$ be a $(4M-3) \times (4M-3)$ diagonal modulation operator with diagonal entries $\{\omega_k\}_{k=0}^{4M-4}$ satisfying $|\omega_k| = 1$ for all $k = 0, \ldots, 4M-4$ and $\omega_j\overline{\omega_k} \notin \mathbb{R}$ for all $j \neq k$. Then $x$ can be recovered up to a global phase factor from $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$.

Proof. Letting $q$ denote the index of the last nonzero entry of $x$, it suffices to estimate $\{x(k)\}_{k=0}^{q}$ up to a global phase factor. To this end, recall from Lemma 2.17 that the last nonzero entry of $\{(\operatorname{CirAut}(x + Rx))(p)\}_{p=0}^{2M-2}$ has index $p = 2q$. If $q = 0$, then we have already seen that $(\operatorname{CirAut}(x + Rx))(0) = 4|x(0)|^2$. Since there exists $\psi \in [0, 2\pi)$ such that $x(0) = e^{-i\psi}|x(0)|$, we may take

\[
\hat{x}(0) := \tfrac{1}{2}\sqrt{(\operatorname{CirAut}(x + Rx))(0)} = |x(0)| = e^{i\psi}x(0).
\]

Otherwise $q \in [1, M-1]$, and (12) gives

\[
(\operatorname{CirAut}(x + Rx))(2q) = |x(q)|^2 + 2\operatorname{Re}\Big(\sum_{p'=q+1}^{M-1} x(p')\big(\overline{x(p'-2q)} + \overline{x(2q-p')}\big)\Big) = |x(q)|^2.
\]

Thus, taking $\hat{x}(q) := \sqrt{(\operatorname{CirAut}(x + Rx))(2q)} = |x(q)|$ gives us $\hat{x}(q) = e^{i\psi}x(q)$ for some $\psi \in [0, 2\pi)$.

In the case where $q = 1$, all that remains to determine is $\hat{x}(0)$, a calculation which we save for the end of the proof. For now, suppose $q \geq 2$. Since we already know $\hat{x}(q) = e^{i\psi}x(q)$, we would like to determine $\hat{x}(k)$ for $k = 1, \ldots, q-1$. To this end, take $r \in [0, q-2]$ and suppose we have $\hat{x}(k) = e^{i\psi}x(k)$ for all $k = q-r, \ldots, q$. If we can obtain $\hat{x}(q-(r+1))$ up to the same phase from this information, then working iteratively from $r = 0$ to $r = q-2$ will give us $\hat{x}(k)$ up to global phase for all but the zeroth entry (which we address later). Note when $r$ is even, (12) gives

\[
\begin{aligned}
(\operatorname{CirAut}(x + Rx))(2q-(r+1))
&= 2\operatorname{Re}\Big(\sum_{p'=q-\frac{r}{2}}^{M-1} x(p')\big(\overline{x(p'-(2q-(r+1)))} + \overline{x((2q-(r+1))-p')}\big)\Big) \\
&= 2\operatorname{Re}\big(x(q)\overline{x(q-(r+1))}\big) + 2\sum_{p'=q-\frac{r}{2}}^{q-1} \operatorname{Re}\big(x(p')\overline{x((2q-(r+1))-p')}\big),
\end{aligned}
\]

where the last equality follows from the observation that

\[
p' - (2q-(r+1)) \leq -q + (r+1) \leq -1
\]

over the range of the sum, meaning $x(p'-(2q-(r+1))) = 0$ throughout the sum. Similarly when $r$ is odd, (12) gives

\[
\begin{aligned}
(\operatorname{CirAut}(x + Rx))(2q-(r+1)) = {}& 2\operatorname{Re}\big(x(q)\overline{x(q-(r+1))}\big) + \big|x(q-\tfrac{r+1}{2})\big|^2 \\
&+ 2\sum_{p'=q-\frac{r-1}{2}}^{q-1} \operatorname{Re}\big(x(p')\overline{x((2q-(r+1))-p')}\big).
\end{aligned}
\]

In either case, we can isolate $\operatorname{Re}(x(q)\overline{x(q-(r+1))})$ to get an expression in terms of $(\operatorname{CirAut}(x + Rx))(2q-(r+1))$ and other terms of the form $\operatorname{Re}(x(k)\overline{x(k')})$ or $|x(k)|^2$ for $k, k' \in [q-r, q-1]$. By the induction hypothesis, we have $\hat{x}(k) = e^{i\psi}x(k)$ for $k = q-r, \ldots, q-1$, and so we can use these estimates to determine these other terms:

\[
\begin{aligned}
\operatorname{Re}\big(\hat{x}(k)\overline{\hat{x}(k')}\big) &= \operatorname{Re}\big(e^{i\psi}x(k)\overline{e^{i\psi}x(k')}\big) = \operatorname{Re}\big(x(k)\overline{x(k')}\big), \\
|\hat{x}(k)|^2 &= |e^{i\psi}x(k)|^2 = |x(k)|^2.
\end{aligned}
\]

As such, we can use $(\operatorname{CirAut}(x + Rx))(2q-(r+1))$ along with the higher-indexed estimates $\hat{x}(k)$ to determine the term $\operatorname{Re}(x(q)\overline{x(q-(r+1))})$. Similarly, we can use $(\operatorname{CirAut}(Ex + REx))(2q-(r+1))$ along with the higher-indexed estimates $\hat{x}(k)$ to determine $\operatorname{Re}(\omega_q\overline{\omega_{q-(r+1)}}x(q)\overline{x(q-(r+1))})$. We then plug these into (22), along with the estimate $\hat{x}(q) = e^{i\psi}x(q)$ (which is also available by the induction hypothesis), to get $\hat{x}(q-(r+1)) = e^{i\psi}x(q-(r+1))$.

At this point, we have determined $\{x(k)\}_{k=1}^{q}$ up to a global phase factor whenever $q \geq 1$, and so it remains to find $\hat{x}(0)$. For this, note that when $q$ is odd, (12) gives

\[
(\operatorname{CirAut}(x + Rx))(q) = 4\operatorname{Re}\big(x(q)\overline{x(0)}\big) + 2\sum_{p'=\frac{q+1}{2}}^{q-1} \operatorname{Re}\big(x(p')\overline{x(q-p')}\big),
\]

while for even $q$, we have

\[
(\operatorname{CirAut}(x + Rx))(q) = 4\operatorname{Re}\big(x(q)\overline{x(0)}\big) + 2\sum_{p'=\frac{q}{2}+1}^{q-1} \operatorname{Re}\big(x(p')\overline{x(q-p')}\big) + \big|x(\tfrac{q}{2})\big|^2.
\]

As before, isolating $\operatorname{Re}(x(q)\overline{x(0)})$ in either case produces an expression in terms of $(\operatorname{CirAut}(x + Rx))(q)$ and other terms of the form $\operatorname{Re}(x(k)\overline{x(k')})$ or $|x(k)|^2$ for $k, k' \in [1, q-1]$. These other terms can be calculated using the estimates $\{\hat{x}(k)\}_{k=1}^{q-1}$, and so we can also calculate $\operatorname{Re}(x(q)\overline{x(0)})$ from $(\operatorname{CirAut}(x + Rx))(q)$. Similarly, we can calculate $\operatorname{Re}(\omega_q\overline{\omega_0}x(q)\overline{x(0)})$ from $\{\hat{x}(k)\}_{k=1}^{q-1}$ and $(\operatorname{CirAut}(Ex + REx))(q)$, and plugging these into (22) along with $\hat{x}(q)$ produces the estimate $\hat{x}(0) = e^{i\psi}x(0)$. □
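
The proof above is effectively an algorithm, and it can be sketched compactly in code. The following is a minimal illustration under stated assumptions rather than a reference implementation: the names `cir_aut`, `reverse` and `recover`, the tolerance `tol` used to locate the last nonzero autocorrelation entry, and the choice $\omega_k = e^{2\pi i k/(2M-1)}$ are made here for the example. For the entry at index $p = q + k$, the sketch subtracts the already-known cross terms, divides by 2 (or by 4 when $k = 0$), and applies (22).

```python
import numpy as np

def cir_aut(u):
    """(CirAut u)(p) = sum_{p'} u(p') * conj(u(p' - p)), with indices taken mod len(u)."""
    return np.array([np.vdot(np.roll(u, p), u) for p in range(len(u))])

def reverse(u):
    """(Ru)(p) = u(-p), with indices taken mod len(u)."""
    return np.roll(u[::-1], 1)

def recover(c1, c2, omega, M, tol=1e-10):
    """Estimate x up to global phase from c1 = CirAut(x + Rx) and c2 = CirAut(Ex + REx)."""
    c1, c2 = np.real(c1), np.real(c2)
    w = omega[:M]
    x_hat = np.zeros(M, dtype=complex)
    nonzero = np.nonzero(np.abs(c1[:2 * M - 1]) > tol)[0]
    if len(nonzero) == 0:
        return x_hat                                   # x is the zero signal
    q = nonzero[-1] // 2                               # Lemma 2.17
    x_hat[q] = np.sqrt(c1[0]) / 2 if q == 0 else np.sqrt(c1[2 * q])
    for k in range(q - 1, -1, -1):                     # work backward from q - 1 to 0
        p = q + k
        j = np.arange(k + 1, q)                        # indices already estimated
        known_plain = np.sum(np.real(x_hat[j] * np.conj(x_hat[p - j])))
        known_mod = np.sum(np.real(w[j] * x_hat[j] * np.conj(w[p - j] * x_hat[p - j])))
        mult = 4.0 if k == 0 else 2.0
        re_plain = (c1[p] - known_plain) / mult        # Re(x(q) conj(x(k)))
        re_mod = (c2[p] - known_mod) / mult            # Re(w_q conj(w_k) x(q) conj(x(k)))
        om = w[q] * np.conj(w[k])
        x_hat[k] = 1j * (re_mod - om * re_plain) / (np.conj(x_hat[q]) * om.imag)  # (22)
    return x_hat

def autocorrelations(x):
    """Build the two autocorrelations for x in C^M using omega_k = exp(2 pi i k / (2M - 1))."""
    M = len(x)
    P = 4 * M - 3
    xp = np.concatenate([x, np.zeros(P - M)])
    omega = np.exp(2j * np.pi * np.arange(P) / (2 * M - 1))
    c1 = cir_aut(xp + reverse(xp))
    c2 = cir_aut(omega * xp + reverse(omega * xp))
    return c1, c2, omega
```

Since the estimate of $x(q)$ is chosen to be the positive real number $|x(q)|$, the output of `recover` should equal $x$ multiplied by the unit-modulus factor $\overline{x(q)}/|x(q)|$, where $q$ indexes the last nonzero entry.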

Theorem 2.19 establishes that it is possible to recover a signal $x \in \mathbb{C}^M$ up to a global phase from $\{(\operatorname{CirAut}(x + Rx))(q)\}_{q=0}^{2M-2}$ and $\{(\operatorname{CirAut}(Ex + REx))(q)\}_{q=0}^{2M-2}$. We now return to how these circular autocorrelations relate to intensity measurements. Recall from (10) that the DFT of the circular autocorrelation is the modulus squared of the DFT of the original signal: $(F^*\operatorname{CirAut}(u))(q) = |(F^*u)(q)|^2$. Also note that the DFT commutes with the reversal operator:

\[
(F^*Ru)(q) = \sum_{p\in\mathbb{Z}_P} u(-p)e^{-2\pi i pq/P} = \sum_{p'\in\mathbb{Z}_P} u(p')e^{-2\pi i p'(-q)/P} = (F^*u)(-q) = (RF^*u)(q).
\]

With this, we can express $\operatorname{CirAut}(x + Rx)$ in terms of intensity measurements with a particular ensemble:

\[
\begin{aligned}
(F^*\operatorname{CirAut}(x + Rx))(q) &= |(F^*(x + Rx))(q)|^2 = |(F^*x)(q) + (F^*Rx)(q)|^2 \\
&= |(F^*x)(q) + (F^*x)(-q)|^2 = |\langle x, f_q + f_{-q}\rangle|^2.
\end{aligned}
\]

Defining the $q$th discrete cosine function $c_q \in \ell(\mathbb{Z}_{4M-3})$ by

\[
c_q(p) := 2\cos\Big(\frac{2\pi pq}{4M-3}\Big) = e^{2\pi i pq/(4M-3)} + e^{-2\pi i pq/(4M-3)} = (f_q + f_{-q})(p),
\]

this means that $(F^*\operatorname{CirAut}(x + Rx))(q) = |\langle x, c_q\rangle|^2$ for all $q \in \mathbb{Z}_{4M-3}$. Similarly, if we take the modulation matrix $E$ to have diagonal entries $\omega_k = e^{2\pi i k/(2M-1)}$ for all $k = 0, \ldots, 4M-4$, we find

\[
(F^*\operatorname{CirAut}(Ex + REx))(q) = |\langle Ex, c_q\rangle|^2 = |\langle x, E^*c_q\rangle|^2.
\]

Thus, coupling the DFT with Theorem 2.19 allows us to recover the signal $x$ from $4M-2$ intensity measurements, namely with the ensemble $\{c_q\}_{q=0}^{2M-2} \cup \{E^*c_q\}_{q=0}^{2M-2}$. Note that since $x \in \ell(\mathbb{Z}_{4M-3})$ is actually a zero-padded version of $x \in \mathbb{C}^M$, we may view $c_q$ and $E^*c_q$ as members of $\mathbb{C}^M$ by discarding the entries indexed by $p = M, \ldots, 4M-4$.
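
The correspondence between the two autocorrelations and intensity measurements against truncated cosines can be checked numerically. In this sketch, the dimension $M = 4$, the random signal, and all variable names are illustrative assumptions; `np.fft.fft` is used in the role of $F^*$.

```python
import numpy as np

def cir_aut(u):
    """(CirAut u)(p) = sum_{p'} u(p') * conj(u(p' - p)), with indices taken mod len(u)."""
    return np.array([np.vdot(np.roll(u, p), u) for p in range(len(u))])

def reverse(u):
    """(Ru)(p) = u(-p), with indices taken mod len(u)."""
    return np.roll(u[::-1], 1)

M = 4
P = 4 * M - 3
rng = np.random.default_rng(2)
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
xp = np.concatenate([x, np.zeros(P - M)])                    # zero-padded x
omega = np.exp(2j * np.pi * np.arange(P) / (2 * M - 1))      # diagonal of E

# Row q holds the discrete cosine c_q(p) = 2 cos(2 pi p q / (4M - 3)).
p = np.arange(P)
cosines = np.array([2 * np.cos(2 * np.pi * p * q / P) for q in range(P)])

# c_q is real, so <x, c_q> is a plain dot product with the first M entries of c_q.
intens_plain = np.abs(cosines[:, :M] @ x) ** 2               # |<x, c_q>|^2
intens_mod = np.abs(cosines[:, :M] @ (omega[:M] * x)) ** 2   # |<x, E^* c_q>|^2

dft_plain = np.fft.fft(cir_aut(xp + reverse(xp))).real
dft_mod = np.fft.fft(cir_aut(omega * xp + reverse(omega * xp))).real
```

Both pairs of arrays should agree entrywise, which is the numerical content of the two displayed identities above.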

Considering  this  section  promised  phase  retrieval  from  only  AM  —  A  intensity 
measurements,  we  must  somehow  find  a  way  to  discard  two  of  these  AM  —  2  mea- 
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surement  vectors.  To  do  this,  first  note  that 


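$$\begin{aligned}
(\operatorname{CirAut}(Ex + REx))(0) &= \|Ex + REx\|^2 \\
&= \sum_{k \in \mathbb{Z}_{4M-3}} \big| e^{2\pi i k/(2M-1)} x(k) + e^{2\pi i(-k)/(2M-1)} x(-k) \big|^2 \\
&= |2x(0)|^2 + \sum_{k=-(2M-2)}^{-1} \big| e^{2\pi i(-k)/(2M-1)} x(-k) \big|^2 + \sum_{k=1}^{2M-2} \big| e^{2\pi i k/(2M-1)} x(k) \big|^2 \\
&= \|x + Rx\|^2 = \operatorname{CirAut}(x + Rx)(0).
\end{aligned}$$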


Moreover,  we  have 

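$$\begin{aligned}
(\operatorname{CirAut}(Ex + REx))(2M-2) &= \sum_{k \in \mathbb{Z}_{4M-3}} (Ex + REx)(k)\,\overline{(Ex + REx)(k - (2M-2))} \\
&= (Ex + REx)(M-1)\,\overline{(Ex + REx)(-(M-1))} \\
&= (Ex + REx)(M-1)\,\overline{(Ex + REx)(M-1)},
\end{aligned}$$

where the last equality is by even symmetry. Since $x$ is only supported on
$k = 0, \ldots, M-1$, we then have

$$\begin{aligned}
(\operatorname{CirAut}(Ex + REx))(2M-2) &= |(Ex + REx)(M-1)|^2 \\
&= \big| e^{2\pi i(M-1)/(2M-1)} x(M-1) + e^{-2\pi i(M-1)/(2M-1)} x(-(M-1)) \big|^2 \\
&= \big| e^{2\pi i(M-1)/(2M-1)} x(M-1) \big|^2 \\
&= |x(M-1)|^2 = (\operatorname{CirAut}(x + Rx))(2M-2).
\end{aligned}$$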
Furthermore, the even symmetry of the circular autocorrelation also gives

$$(\operatorname{CirAut}(Ex + REx))(-(2M-2)) = (\operatorname{CirAut}(Ex + REx))(2M-2) = (\operatorname{CirAut}(x + Rx))(2M-2) = (\operatorname{CirAut}(x + Rx))(-(2M-2)).$$

These redundancies between $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$ indicate that we
might be able to remove measurement vectors from our ensemble while maintaining
our ability to perform phase retrieval. The following theorem confirms this suspicion:

Theorem 2.20. Let $c_q \in \mathbb{C}^M$ be the truncated discrete cosine function defined by
$c_q(p) := 2\cos\big(\frac{2\pi pq}{4M-3}\big)$ for all $p = 0, \ldots, M-1$, and let $E$ be the $M \times M$ diagonal
modulation operator with diagonal entries $\omega_k = e^{2\pi i k/(2M-1)}$ for all $k = 0, \ldots, M-1$.
Then the intensity measurement mapping $\mathcal{A}\colon \mathbb{C}^M/\mathbb{T} \to \mathbb{R}^{4M-4}$ defined by
$\mathcal{A}(x) := \{|\langle x, c_q \rangle|^2\}_{q=0}^{2M-2} \cup \{|\langle x, E^*c_q \rangle|^2\}_{q=1}^{2M-3}$ is injective.

Proof. Since Theorem 2.19 allows us to reconstruct any $x \in \mathbb{C}^M$ up to a global phase
factor from the entries of $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$, it suffices to show
that the intensity measurements $\{|\langle x, c_q \rangle|^2\}_{q=0}^{2M-2} \cup \{|\langle x, E^*c_q \rangle|^2\}_{q=1}^{2M-3}$ allow us to
recover the entries of these circular autocorrelations. To this end, recall from (10)
that these quantities are related through the inverse DFT:

$$\operatorname{CirAut}(x + Rx) = (F^*)^{-1}\{|\langle x, c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}},$$

$$\operatorname{CirAut}(Ex + REx) = (F^*)^{-1}\{|\langle x, E^*c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}}.$$

Since we have $\{|\langle x, c_q \rangle|^2\}_{q=0}^{2M-2}$, we can exploit even symmetry to determine the rest of
$\{|\langle x, c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}}$, and then apply the inverse DFT to get $\operatorname{CirAut}(x + Rx)$. Moreover,
by the previous discussion, we also obtain the $0$, $2M-2$, and $-(2M-2)$ entries of
$\operatorname{CirAut}(Ex + REx)$ from the corresponding entries of $\operatorname{CirAut}(x + Rx)$. Organize this
information about $\operatorname{CirAut}(Ex + REx)$ into a vector $w \in \ell(\mathbb{Z}_{4M-3})$ whose $0$, $2M-2$,
and $-(2M-2)$ entries come from $\operatorname{CirAut}(Ex + REx)$ and whose remaining entries
are populated by even symmetry from $\{|\langle x, E^*c_q \rangle|^2\}_{q=1}^{2M-3}$. We can express $w$ as a
matrix-vector product $w = A\{|\langle x, E^*c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}}$, where $A$ is the identity matrix
with the $0$, $2M-2$, and $-(2M-2)$ rows replaced by the corresponding rows of the
inverse DFT matrix. To complete the proof, it suffices to show that the matrix $A$ is
invertible, since this would imply $\operatorname{CirAut}(Ex + REx) = (F^*)^{-1}A^{-1}w$.

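Using the cofactor expansion, note that $\det(A)$ reduces to a determinant of a
$3 \times 3$ submatrix of $(F^*)^{-1}$. Specifically, letting $\theta := 2\pi(2M-2)^2/(4M-3)$ we have

$$\begin{aligned}
\det(A) &= \det \begin{pmatrix} 1 & 1 & 1 \\ 1 & e^{i\theta} & e^{-i\theta} \\ 1 & e^{-i\theta} & e^{i\theta} \end{pmatrix} \\
&= (e^{2i\theta} - e^{-2i\theta}) - (e^{i\theta} - e^{-i\theta}) + (e^{-i\theta} - e^{i\theta}) \\
&= (e^{i\theta} + e^{-i\theta} - 2)(e^{i\theta} - e^{-i\theta}) = 4i(\cos(\theta) - 1)\sin(\theta),
\end{aligned}$$

and so $A$ is invertible if and only if $\cos(\theta) - 1 \neq 0$ and $\sin(\theta) \neq 0$. This is equivalent
to having $\pi$ not divide $\theta$, and indeed, the ratio

$$\frac{\theta}{\pi} = \frac{2(2M-2)^2}{4M-3} = 2M - \frac{5}{2} + \frac{1}{2(4M-3)}$$

is not an integer because $M \geq 2$. As such, $A$ is invertible. □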
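As a sanity check on the determinant computation, the closed form $4i(\cos\theta - 1)\sin\theta$ and the non-integrality of $\theta/\pi$ can be verified for small $M$. This is only a sketch: the helper names `det3` and `check` are illustrative, and the normalization constant of the inverse DFT is ignored, as it is in the proof.

```python
import cmath
import math
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def check(M):
    """Return (gap between direct and closed-form determinant, exact theta/pi)."""
    theta = 2 * math.pi * (2 * M - 2) ** 2 / (4 * M - 3)
    ep, em = cmath.exp(1j * theta), cmath.exp(-1j * theta)
    direct = det3([[1, 1, 1], [1, ep, em], [1, em, ep]])
    closed = 4j * (math.cos(theta) - 1) * math.sin(theta)
    ratio = Fraction(2 * (2 * M - 2) ** 2, 4 * M - 3)   # theta / pi as an exact fraction
    return abs(direct - closed), ratio

for M in range(2, 8):
    gap, ratio = check(M)
    # ratio.denominator != 1 means theta/pi is not an integer, so A is invertible
    print(M, ratio, ratio.denominator != 1, gap < 1e-9)
```

Working with `Fraction` keeps the integrality test exact, whereas the determinant comparison is necessarily a floating-point one.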

The following summarizes the measurement design and phase retrieval procedure
described in this section:

Measurement design

• Define the $q$th truncated discrete cosine function $c_q := \{2\cos(\frac{2\pi pq}{4M-3})\}_{p=0}^{M-1}$

• Define the $M \times M$ diagonal matrix $E$ with entries $\omega_k := e^{2\pi i k/(2M-1)}$ for all
$k = 0, \ldots, M-1$

• Take $\Phi := \{c_q\}_{q=0}^{2M-2} \cup \{E^*c_q\}_{q=1}^{2M-3}$

Phase retrieval procedure

• Calculate $\{|\langle x, c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}}$ from $\{|\langle x, c_q \rangle|^2\}_{q=0}^{2M-2}$ by even extension

• Calculate $\operatorname{CirAut}(x + Rx) = (F^*)^{-1}\{|\langle x, c_q \rangle|^2\}_{q \in \mathbb{Z}_{4M-3}}$

• Define $w \in \ell(\mathbb{Z}_{4M-3})$ so that its $0$, $2M-2$, and $-(2M-2)$ entries are the
corresponding entries in $\operatorname{CirAut}(x + Rx)$ and its remaining entries are populated
by even symmetry from $\{|\langle x, E^*c_q \rangle|^2\}_{q=1}^{2M-3}$

• Define $A$ to be the identity matrix with the $0$, $2M-2$, and $-(2M-2)$ rows
replaced by the corresponding rows of the inverse DFT matrix $(F^*)^{-1}$

• Calculate $\operatorname{CirAut}(Ex + REx) = (F^*)^{-1}A^{-1}w$

• Recover $x$ up to global phase from $\operatorname{CirAut}(x + Rx)$ and $\operatorname{CirAut}(Ex + REx)$
using the process described in the proof of Theorem 2.19
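The steps above, up to (but not including) the final recovery via Theorem 2.19, can be sketched in plain Python. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the inverse DFT is normalized by $1/P$ with $P = 4M-3$, and the linear system $Av = w$ is reduced to the $3 \times 3$ block of unknown intensities because every other row of $A$ is an identity row.

```python
import cmath
import math

def idft(v):
    """Inverse DFT matching (F*u)(q) = sum_p u(p) exp(-2 pi i p q / P)."""
    P = len(v)
    return [sum(v[q] * cmath.exp(2j * math.pi * p * q / P) for q in range(P)) / P
            for p in range(P)]

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small complex systems)."""
    n = len(b)
    aug = [list(A[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return [aug[r][n] / aug[r][r] for r in range(n)]

def cosine(q, M):
    """Truncated discrete cosine c_q(p) = 2 cos(2 pi p q / (4M-3)), p = 0..M-1."""
    return [2 * math.cos(2 * math.pi * p * q / (4 * M - 3)) for p in range(M)]

def measure(x):
    """The 4M-4 intensities: |<x,c_q>|^2, q = 0..2M-2, and |<x,E*c_q>|^2, q = 1..2M-3."""
    M = len(x)
    Ex = [cmath.exp(2j * math.pi * k / (2 * M - 1)) * x[k] for k in range(M)]
    plain = [abs(sum(a * c for a, c in zip(x, cosine(q, M)))) ** 2
             for q in range(2 * M - 1)]
    modulated = {q: abs(sum(a * c for a, c in zip(Ex, cosine(q, M)))) ** 2
                 for q in range(1, 2 * M - 2)}
    return plain, modulated

def autocorrelations(plain, modulated, M):
    """Return (CirAut(x + Rx), CirAut(Ex + REx)) computed from intensities only."""
    P = 4 * M - 3
    special = [0, 2 * M - 2, P - (2 * M - 2)]   # indices 0, 2M-2 and -(2M-2) mod P

    def inv_row(p):
        """Row p of the inverse DFT matrix (F*)^{-1}."""
        return [cmath.exp(2j * math.pi * p * q / P) / P for q in range(P)]

    full_plain = [plain[min(q, P - q)] for q in range(P)]      # even extension
    aut_plain = idft(full_plain)                                # CirAut(x + Rx)
    # w holds three autocorrelation entries and the measured modulated intensities
    w = [aut_plain[q] if q in special else modulated[min(q, P - q)]
         for q in range(P)]
    # A v = w: identity rows give v[q] = w[q]; the special rows give a 3x3 system
    sub = [[inv_row(p)[q] for q in special] for p in special]
    rhs = [w[p] - sum(inv_row(p)[q] * w[q] for q in range(P) if q not in special)
           for p in special]
    full_mod = list(w)
    for q, value in zip(special, solve(sub, rhs)):
        full_mod[q] = value
    return aut_plain, idft(full_mod)                            # CirAut(Ex + REx)

x = [1 + 2j, -0.5j, 0.3 - 1j]            # arbitrary test signal with M = 3
plain, modulated = measure(x)
aut_plain, aut_mod = autocorrelations(plain, modulated, len(x))
```

The resulting `aut_mod` can be compared against a direct computation of the autocorrelation of $Ex + REx$ for the zero-padded signal, which is what the invertibility of $A$ guarantees.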
III. Almost injective intensity measurements and

the  computational  limits  of  phase  retrieval 

Now that we have a better understanding of injectivity in phase retrieval, it is natural
to ask how much we might lose if we reduce the size of the measurement ensemble
$\Phi = \{\varphi_n\}_{n=1}^N \subseteq V$, where $V = \mathbb{R}^M$ or $\mathbb{C}^M$, below the known and conjectured lower
bounds ($2M-1$ for the real case and $4M-4$ for the complex case, respectively).
Indeed, reducing the number of measurements is often desirable in practice as each
measurement typically incurs some sort of cost. For instance, in the case of synthetic
aperture radar, the number of measurements is proportional to the number
of aircraft employed in the multistatic system, each of which contributes costs in
energy and maintenance. Perhaps surprisingly, we can often decrease the number
of measurements without losing much: As we will see, almost every signal can be
completely determined from half the measurements required for injectivity. In this
chapter, we address this issue by formally introducing a theory of almost injective
intensity measurements, in which we relax the injectivity requirement to a set of
signals that is dense in $\mathbb{R}^M$ ($\mathbb{C}^M$). For simplicity, we dedicate the analysis of almost
injectivity to the real case, and we conclude by examining algorithmic efficiency in
this setting.

3.1  Almost  injectivity 

While $4M + o(M)$ measurements are necessary and generically sufficient for
injectivity in the complex case, one can save a factor of 2 in the number of measurements
by slightly weakening the desired notion of injectivity [8,57]. To be explicit,
we start with the following definition:

Definition 3.1. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq V$, where $V = \mathbb{R}^M$ or $\mathbb{C}^M$. The intensity
measurement mapping $\mathcal{A}\colon V/S \to \mathbb{R}^N$, where $S = \{\pm 1\}$ (resp. $\mathbb{T}$), defined by
$(\mathcal{A}(x))(n) := |\langle x, \varphi_n \rangle|^2$ is said to be almost injective if $\mathcal{A}^{-1}(\mathcal{A}(x)) = \{\omega x : |\omega| = 1\}$
for almost every $x \in V/S$.
For the complex case, it is known that $2M$ measurements are necessary for
almost injectivity [57], and that $2M$ generic measurements suffice [8] (cf. [56]); this
is the factor-of-2 savings mentioned above. For the real case, it is also known how
many measurements are necessary and generically sufficient for almost injectivity:
$M+1$ [8]. Like the complex case, this is also a factor-of-2 savings from the injectivity
requirement: $2M-1$. This requirement for injectivity in the real case follows from
the complement property characterization of injectivity from [8] (Theorem 2.2 of
this paper). Similar to this result, we will characterize ensembles of measurement
vectors which yield almost injective intensity measurements and, similar to its proof,
the basic idea behind the analysis is to consider sums and differences of signals
with identical intensity measurements. However, the characterization we develop is
limited to the real case; a similar analysis for the complex case remains an open
problem.

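Lemma 3.2. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ and the intensity measurement mapping
$\mathcal{A}\colon \mathbb{R}^M/\{\pm 1\} \to \mathbb{R}^N$ defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n \rangle|^2$. Then $\mathcal{A}$ is almost injective
if and only if almost every $x \in \mathbb{R}^M$ is not in the Minkowski sum $\operatorname{span}(\Phi_S)^\perp \setminus \{0\} +
\operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ for all $S \subseteq \{1, \ldots, N\}$. More precisely, $\mathcal{A}^{-1}(\mathcal{A}(x)) = \{\pm x\}$ if and
only if $x \notin \operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ for any $S \subseteq \{1, \ldots, N\}$.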

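Proof. By the definition of the mapping $\mathcal{A}$, for $x, y \in \mathbb{R}^M$ we have $\mathcal{A}(x) = \mathcal{A}(y)$ if
and only if $|\langle x, \varphi_n \rangle| = |\langle y, \varphi_n \rangle|$ for all $n \in \{1, \ldots, N\}$. This occurs precisely when
there is a subset $S \subseteq \{1, \ldots, N\}$ such that $\langle x, \varphi_n \rangle = -\langle y, \varphi_n \rangle$ for every $n \in S$ and
$\langle x, \varphi_n \rangle = \langle y, \varphi_n \rangle$ for every $n \in S^c$. Thus, $\mathcal{A}^{-1}(\mathcal{A}(x)) = \{\pm x\}$ if and only if for
every $y \neq \pm x$ and for every $S \subseteq \{1, \ldots, N\}$, either there exists an $n \in S$ such that
$\langle x + y, \varphi_n \rangle \neq 0$ or an $n \in S^c$ such that $\langle x - y, \varphi_n \rangle \neq 0$. We claim that this occurs if
and only if $x$ is not in the Minkowski sum $\operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ for
all $S \subseteq \{1, \ldots, N\}$, which would complete the proof. We verify the claim by seeking
the contrapositive in each direction.

($\Rightarrow$) Suppose $x \in \operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ for some $S$. Then there exist
$u \in \operatorname{span}(\Phi_S)^\perp \setminus \{0\}$ and $v \in \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ such that $x = u + v$. Taking $y := u - v$,
we see that $x + y = 2u \in \operatorname{span}(\Phi_S)^\perp \setminus \{0\}$ and $x - y = 2v \in \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$, which
means that there is no $n \in S$ such that $\langle x + y, \varphi_n \rangle \neq 0$ nor $n \in S^c$ such that
$\langle x - y, \varphi_n \rangle \neq 0$. Furthermore, $u$ and $v$ are nonzero, and so $y \neq \pm x$.

($\Leftarrow$) Suppose $y \neq \pm x$ and for some $S \subseteq \{1, \ldots, N\}$ there is no $n \in S$ such that
$\langle x + y, \varphi_n \rangle \neq 0$ nor $n \in S^c$ such that $\langle x - y, \varphi_n \rangle \neq 0$. Then $x + y \in \operatorname{span}(\Phi_S)^\perp \setminus \{0\}$
and $x - y \in \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$. Since $x = \frac{1}{2}(x + y) + \frac{1}{2}(x - y)$, we have that
$x \in \operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$. □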
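The construction $y := u - v$ from the proof can be made concrete. The sketch below uses the four (unnormalized) simplex vectors in $\mathbb{R}^3$ that appear in Figure 2; the particular choices of $S$, $u$ and $v$ are illustrative assumptions picked so that $u \perp \varphi_n$ for $n \in S$ and $v \perp \varphi_n$ for $n \in S^c$.

```python
def inner(x, y):
    """Real inner product."""
    return sum(a * b for a, b in zip(x, y))

def intensities(x, phi):
    """Intensity measurements |<x, phi_n>|^2."""
    return [inner(x, p) ** 2 for p in phi]

# Unnormalized simplex vectors in R^3 (scaling does not affect orthogonality)
phi = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
S, Sc = [0, 1], [2, 3]                    # zero-based index sets

u = (0, 1, -1)                            # nonzero, orthogonal to phi[n] for n in S
v = (0, 2, 2)                             # nonzero, orthogonal to phi[n] for n in Sc
x = tuple(a + b for a, b in zip(u, v))    # x = u + v lies in the Minkowski sum
y = tuple(a - b for a, b in zip(u, v))    # y = u - v has the same intensities

print(x, y, intensities(x, phi), intensities(y, phi))
```

Here $x$ and $y$ share all four intensity measurements although $y \neq \pm x$, and $x$ lies in a coordinate plane, consistent with the description of the exceptional set in Figure 2.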

If the mapping $\mathcal{A}$ is injective, then the ensemble $\Phi$ in Lemma 3.2 must satisfy
the complement property, and so the set $\operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$ is of
the form $V \setminus \{0\} + \emptyset = \emptyset$ regardless of the choice of $S \subseteq \{1, \ldots, N\}$ (here, $V \subseteq \mathbb{R}^M$ is
a proper subspace). Hence, this Minkowski sum requirement slightly weakens what
it means to be injective. We will continue to investigate this set with the aid of the
following lemma:

Lemma 3.3. Let $U$ and $V$ be subspaces of a common vector space. If $U \cap V = \{0\}$,
then $U \setminus \{0\} + V \setminus \{0\} = (U + V) \setminus (U \cup V)$.

Proof. Since $U \setminus \{0\} + V \setminus \{0\}$ is a subset of $U + V$, it suffices to show that
$(U \setminus \{0\} + V \setminus \{0\}) \cap (U \cup V) = \emptyset$. To this end, suppose $x \in U \setminus \{0\} + V \setminus \{0\}$.
Then $x = u + v$ for some nonzero vectors $u \in U$ and $v \in V$. Since $U \cap V = \{0\}$, it
follows that $x \notin U$ and $x \notin V$, that is, $x \notin U \cup V$. Likewise, if $x \in U \cup V$, then the
fact that $U \cap V = \{0\}$ implies $x = u + v$ for some $u \in U$ and $v \in V$ where either $u$
or $v$ is zero. Hence, $x \notin U \setminus \{0\} + V \setminus \{0\}$, completing the proof. □

Theorem 3.4. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ and the intensity measurement mapping
$\mathcal{A}\colon \mathbb{R}^M/\{\pm 1\} \to \mathbb{R}^N$ defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n \rangle|^2$. Suppose $\Phi$ spans $\mathbb{R}^M$
and each $\varphi_n$ is nonzero. Then $\mathcal{A}$ is almost injective if and only if the Minkowski
sum $\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp$ is a proper subspace of $\mathbb{R}^M$ for each nonempty proper
subset $S \subseteq \{1, \ldots, N\}$.

Note that this result is not terribly surprising considering Lemma 3.2, as the
new condition involves a simpler Minkowski sum in exchange for additional (reasonable
and testable) assumptions on $\Phi$. The proof of this theorem amounts to
measuring the difference between the two Minkowski sums:

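Proof of Theorem 3.4. First note that the spanning assumption on $\Phi$ implies

$$\operatorname{span}(\Phi_S)^\perp \cap \operatorname{span}(\Phi_{S^c})^\perp = (\operatorname{span}(\Phi_S) + \operatorname{span}(\Phi_{S^c}))^\perp = \operatorname{span}(\Phi)^\perp = \{0\},$$

and so applying Lemma 3.3 yields the following identity:

$$\operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\} = \big(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp\big) \setminus \big(\operatorname{span}(\Phi_S)^\perp \cup \operatorname{span}(\Phi_{S^c})^\perp\big). \quad (23)$$

From Lemma 3.2 we know that $\mathcal{A}$ is almost injective if and only if almost every
$x \in \mathbb{R}^M$ is not in the Minkowski sum $\operatorname{span}(\Phi_S)^\perp \setminus \{0\} + \operatorname{span}(\Phi_{S^c})^\perp \setminus \{0\}$
for any $S \subseteq \{1, \ldots, N\}$. In other words, the Lebesgue measure (which we denote
by $\operatorname{Leb}[\cdot]$) of this Minkowski sum is zero for each $S \subseteq \{1, \ldots, N\}$. By (23),
this equivalently means that the Lebesgue measure of $(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) \setminus
(\operatorname{span}(\Phi_S)^\perp \cup \operatorname{span}(\Phi_{S^c})^\perp)$ is zero for each $S \subseteq \{1, \ldots, N\}$. Since $\Phi$ spans $\mathbb{R}^M$,
this set is empty (and therefore has Lebesgue measure zero) when $S = \emptyset$ or $S =
\{1, \ldots, N\}$. Also, since each $\varphi_n$ is nonzero, we know that $\operatorname{span}(\Phi_S)^\perp$ and $\operatorname{span}(\Phi_{S^c})^\perp$
are proper subspaces of $\mathbb{R}^M$ whenever $S$ is a nonempty proper subset of $\{1, \ldots, N\}$,
and so in these cases both subspaces must have Lebesgue measure zero. As such, we
have that for every nonempty proper subset $S \subseteq \{1, \ldots, N\}$,

$$\begin{aligned}
&\operatorname{Leb}\big[(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) \setminus (\operatorname{span}(\Phi_S)^\perp \cup \operatorname{span}(\Phi_{S^c})^\perp)\big] \\
&\quad \geq \operatorname{Leb}\big[\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp\big] - \operatorname{Leb}\big[\operatorname{span}(\Phi_S)^\perp\big] - \operatorname{Leb}\big[\operatorname{span}(\Phi_{S^c})^\perp\big] \\
&\quad = \operatorname{Leb}\big[\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp\big] \\
&\quad \geq \operatorname{Leb}\big[(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) \setminus (\operatorname{span}(\Phi_S)^\perp \cup \operatorname{span}(\Phi_{S^c})^\perp)\big].
\end{aligned}$$

In summary, $(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) \setminus (\operatorname{span}(\Phi_S)^\perp \cup \operatorname{span}(\Phi_{S^c})^\perp)$ having Lebesgue
measure zero for each $S \subseteq \{1, \ldots, N\}$ is equivalent to $\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp$ having
Lebesgue measure zero for each nonempty proper subset $S \subseteq \{1, \ldots, N\}$, which
in turn is equivalent to the Minkowski sum $\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp$ being a proper
subspace of $\mathbb{R}^M$ for each nonempty proper subset $S \subseteq \{1, \ldots, N\}$, as desired. □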

At this point, consider the following stronger restatement of Theorem 3.4:
“Suppose each $\varphi_n$ is nonzero. Then $\mathcal{A}$ is almost injective if and only if $\Phi$ spans
$\mathbb{R}^M$ and the Minkowski sum $\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp$ is a proper subspace of $\mathbb{R}^M$ for
each nonempty proper subset $S \subseteq \{1, \ldots, N\}$.” Note that we can move the spanning
assumption into the condition because if $\Phi$ does not span, then we can decompose
almost every $x \in \mathbb{R}^M$ as $x = u + v$ such that $u \in \operatorname{span}(\Phi)$ and $v \in \operatorname{span}(\Phi)^\perp$
with $v \neq 0$, and defining $y := u - v$ then gives $\mathcal{A}(y) = \mathcal{A}(x)$ despite the fact that
$y \neq \pm x$. As for the assumption that the $\varphi_n$'s are nonzero, we note that having
$\varphi_n = 0$ amounts to having the $n$th entry of $\mathcal{A}(x)$ be zero for all $x$. As such, $\Phi$ yields
almost injectivity precisely when the nonzero members of $\Phi$ together yield almost
injectivity. With this identification, the stronger restatement of Theorem 3.4 above
can be viewed as a complete characterization of almost injectivity. Next, we will
replace the Minkowski sum condition with a rather elegant condition involving the
ranks of $\Phi_S$ and $\Phi_{S^c}$ by applying the following lemma:

Lemma 3.5 (Inclusion-exclusion principle for subspaces). Let $U$ and $V$ be subspaces
of a common vector space. Then $\dim(U + V) = \dim U + \dim V - \dim(U \cap V)$.




Proof. Let $A$ be a basis for $U \cap V$ and let $B$ and $C$ be bases for $U$ and $V$, respectively,
such that $A \subseteq B$ and $A \subseteq C$. It can be shown that $A \cup B \cup C$ forms a basis for
$U + V$, which implies that

$$\dim(U + V) = |A| + |B \setminus A| + |C \setminus A| = |B| + |C| - |A| = \dim U + \dim V - \dim(U \cap V),$$

completing the proof. □

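Theorem 3.6. Consider $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ and the intensity measurement mapping
$\mathcal{A}\colon \mathbb{R}^M/\{\pm 1\} \to \mathbb{R}^N$ defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n \rangle|^2$. Suppose each $\varphi_n$ is
nonzero. Then $\mathcal{A}$ is almost injective if and only if $\Phi$ spans $\mathbb{R}^M$ and $\operatorname{rank} \Phi_S +
\operatorname{rank} \Phi_{S^c} > M$ for each nonempty proper subset $S \subseteq \{1, \ldots, N\}$.

Proof. Considering the discussion after the proof of Theorem 3.4, it suffices to assume
that $\Phi$ spans $\mathbb{R}^M$. Furthermore, considering Theorem 3.4, it suffices to characterize
when $\dim(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) < M$. By Lemma 3.5, we have

$$\dim\big(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp\big) = \dim\big(\operatorname{span}(\Phi_S)^\perp\big) + \dim\big(\operatorname{span}(\Phi_{S^c})^\perp\big) - \dim\big(\operatorname{span}(\Phi_S)^\perp \cap \operatorname{span}(\Phi_{S^c})^\perp\big).$$

Since $\Phi$ is assumed to span $\mathbb{R}^M$, we also have that $\operatorname{span}(\Phi_S)^\perp \cap \operatorname{span}(\Phi_{S^c})^\perp = \{0\}$,
and so

$$\begin{aligned}
\dim\big(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp\big) &= \big[M - \dim(\operatorname{span}(\Phi_S))\big] + \big[M - \dim(\operatorname{span}(\Phi_{S^c}))\big] - 0 \\
&= 2M - \operatorname{rank} \Phi_S - \operatorname{rank} \Phi_{S^c}.
\end{aligned}$$

As such, $\dim(\operatorname{span}(\Phi_S)^\perp + \operatorname{span}(\Phi_{S^c})^\perp) < M$ precisely when $\operatorname{rank} \Phi_S + \operatorname{rank} \Phi_{S^c} >
M$. □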
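The rank condition of Theorem 3.6 is directly testable for small ensembles. The following sketch enumerates every nonempty proper subset $S$ (so its cost is exponential in $N$) and uses exact rational arithmetic to avoid rank misjudgments from rounding. The function names are illustrative, and the code assumes, as the theorem does, that every vector is nonzero.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of vectors via exact Gaussian elimination over the rationals."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_almost_injective(phi):
    """Theorem 3.6 test: phi spans R^M and rank(phi_S) + rank(phi_Sc) > M for all S."""
    N, M = len(phi), len(phi[0])
    if rank(phi) < M:
        return False
    for size in range(1, N):
        for S in combinations(range(N), size):
            Sc = [n for n in range(N) if n not in S]
            if rank([phi[n] for n in S]) + rank([phi[n] for n in Sc]) <= M:
                return False
    return True

simplex = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
print(is_almost_injective(simplex))
print(is_almost_injective([(1, 0), (1, 0), (0, 1), (0, 1)]))
```

The second ensemble fails because $S = \{1, 2\}$ gives $\operatorname{rank} \Phi_S + \operatorname{rank} \Phi_{S^c} = 2 = M$.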
At this point, we point out some interesting consequences of Theorem 3.6.
First of all, $\Phi$ cannot be almost injective if $N < M + 1$ since $\operatorname{rank} \Phi_S + \operatorname{rank} \Phi_{S^c} \leq
|S| + |S^c| = N$. Also, in the case where $N = M + 1$, we note that $\Phi$ is almost
injective precisely when $\Phi$ is full spark, that is, every size-$M$ subcollection is a
spanning set (note this implies that all of the $\varphi_n$'s are nonzero). In fact, every full
spark $\Phi$ with $N \geq M + 1$ yields almost injective intensity measurements, which
in turn implies that a generic $\Phi$ yields almost injectivity when $N \geq M + 1$ [8].
This is in direct analogy with injectivity in the real case; here, injectivity requires
$N \geq 2M - 1$, injectivity with $N = 2M - 1$ is equivalent to being full spark, and
being full spark suffices for injectivity whenever $N \geq 2M - 1$ [8]. Another thing to
check is that the condition for injectivity implies the condition for almost injectivity:
Since the mapping $\mathcal{A}$ is injective for real $\Phi$ if and only if $\Phi$ is CP, it follows that
$\operatorname{rank} \Phi_S + \operatorname{rank} \Phi_{S^c} \geq M + 1 > M$ for every nonempty proper subset $S \subseteq \{1, \ldots, N\}$.

Having established that full spark ensembles of size $N \geq M + 1$ yield almost
injective intensity measurements, we note that checking whether a matrix is full
spark is NP-hard in general [70]. Granted, there are a few explicit constructions
of full spark ensembles which can be used [4,81], but it would be nice to have a
condition which is not computationally difficult to test in general. We provide one
such condition in the next theorem, but first, we briefly review the requisite frame
theory.
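To make the cost of the full spark test concrete, here is a brute-force sketch that checks every size-$M$ subcollection with an exact determinant. It examines $\binom{N}{M}$ subsets, so it is only practical for small ensembles; the names `det` and `is_full_spark` are illustrative, and the code assumes $N \geq M$ with rational (or integer) entries.

```python
from fractions import Fraction
from itertools import combinations

def det(rows):
    """Exact determinant by fraction-valued Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in rows]
    n, sign, prod = len(a), 1, Fraction(1)
    for col in range(n):
        piv = next((i for i in range(col, n) if a[i][col] != 0), None)
        if piv is None:
            return Fraction(0)           # no pivot in this column: singular
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            sign = -sign                 # a row swap flips the sign
        prod *= a[col][col]
        for i in range(col + 1, n):
            f = a[i][col] / a[col][col]
            a[i] = [x - f * y for x, y in zip(a[i], a[col])]
    return sign * prod

def is_full_spark(phi):
    """True if every size-M subcollection of the N vectors in R^M is a spanning set."""
    M = len(phi[0])
    return all(det([phi[n] for n in idx]) != 0
               for idx in combinations(range(len(phi)), M))

print(is_full_spark([(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]))
print(is_full_spark([(1, 0), (0, 1), (2, 0)]))
```

The second ensemble is not full spark because the subcollection $\{(1,0), (2,0)\}$ does not span $\mathbb{R}^2$.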

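A frame is an ensemble $\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ together with frame bounds
$0 < A \leq B < \infty$ with the property that for every $x \in \mathbb{R}^M$,

$$A\|x\|^2 \leq \sum_{n=1}^N |\langle x, \varphi_n \rangle|^2 \leq B\|x\|^2.$$

When $A = B$, the frame is said to be tight, and such frames come with a painless
reconstruction formula:

$$x = \frac{1}{A} \sum_{n=1}^N \langle x, \varphi_n \rangle \varphi_n.$$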
To be clear, the theory of frames originated in the context of infinite-dimensional
Hilbert spaces [41,46], and frames have since been studied in finite-dimensional settings,
primarily because this is the setting in which they are applied computationally.
Of particular interest are so-called unit norm tight frames (UNTFs), which are tight
frames whose frame elements have unit norm: $\|\varphi_n\| = 1$ for every $n = 1, \ldots, N$. Such
frames are useful in applications; for example, if one encodes a signal $x$ using frame
coefficients $\langle x, \varphi_n \rangle$ and transmits these coefficients across a channel, then UNTFs are
optimally robust to noise [58] and one erasure [32]. Intuitively, this optimality comes
from the fact that frame elements of a UNTF are particularly well-distributed in the
unit sphere [14]. Another pleasant feature of UNTFs is that it is straightforward to
test whether a given frame is a UNTF: Letting $\Phi = [\varphi_1 \cdots \varphi_N]$ denote an $M \times N$
matrix whose columns are the frame elements, then $\Phi$ is a UNTF precisely when
each of the following occurs simultaneously:

(i) the rows have equal norm

(ii) the rows are orthogonal

(iii) the columns have unit norm

(This is a direct consequence of the tight frame's reconstruction formula and the
fact that a UNTF has unit-norm frame elements; furthermore, since the columns
have unit norm, it is not difficult to see that the rows will necessarily have norm
$\sqrt{N/M}$.) In addition to being able to test that an ensemble is a UNTF, various
UNTFs can be constructed using spectral tetris [30] (though such frames necessarily
have $N \geq 2M$), and every UNTF can be constructed using the recent theory of
eigensteps [20,53]. Now that UNTFs have been properly introduced, we relate them
to almost injectivity for phase retrieval:

Theorem 3.7. If $M$ and $N$ are relatively prime, then every unit norm tight frame
$\Phi = \{\varphi_n\}_{n=1}^N \subseteq \mathbb{R}^M$ yields almost injective intensity measurements.
Proof. Pick a nonempty proper subset $S \subseteq \{1, \ldots, N\}$. By Theorem 3.6, it suffices to
show that $\operatorname{rank} \Phi_S + \operatorname{rank} \Phi_{S^c} > M$, or equivalently, $\operatorname{rank} \Phi_S\Phi_S^* + \operatorname{rank} \Phi_{S^c}\Phi_{S^c}^* > M$.
Note that since $\Phi$ is a unit norm tight frame, we also have

$$\Phi_S\Phi_S^* + \Phi_{S^c}\Phi_{S^c}^* = \Phi\Phi^* = \tfrac{N}{M} I,$$

and so $\Phi_S\Phi_S^*$ and $\Phi_{S^c}\Phi_{S^c}^*$ are simultaneously diagonalizable, i.e., there exists a unitary
matrix $U$ and diagonal matrices $D_1$ and $D_2$ such that

$$UD_1U^* + UD_2U^* = \Phi_S\Phi_S^* + \Phi_{S^c}\Phi_{S^c}^* = \tfrac{N}{M} I.$$

Conjugating by $U^*$, this then implies that $D_1 + D_2 = \frac{N}{M} I$. Let $L_1 \subseteq \{1, \ldots, M\}$
denote the diagonal locations of the nonzero entries in $D_1$, and $L_2 \subseteq \{1, \ldots, M\}$
similarly for $D_2$. To complete the proof, we need to show that $|L_1| + |L_2| > M$ (since
$|L_1| + |L_2| = \operatorname{rank} \Phi_S\Phi_S^* + \operatorname{rank} \Phi_{S^c}\Phi_{S^c}^*$). Note that $L_1 \cup L_2 \neq \{1, \ldots, M\}$ would
imply that $D_1 + D_2$ has at least one zero in its diagonal, contradicting the fact that
$D_1 + D_2$ is a nonzero multiple of the identity; as such, $L_1 \cup L_2 = \{1, \ldots, M\}$ and
$|L_1| + |L_2| \geq M$. We claim that this inequality is strict due to the assumption that $M$
and $N$ are relatively prime. To see this, it suffices to show that $L_1 \cap L_2$ is nonempty.
Suppose to the contrary that $L_1$ and $L_2$ are disjoint. Then since $D_1 + D_2 = \frac{N}{M} I$,
every nonzero entry in $D_1$ must be $N/M$. Since $S$ is a nonempty proper subset of
$\{1, \ldots, N\}$, this means that there exists $K \in (0, M)$ such that $D_1$ has $K$ entries
which are $N/M$ and $M - K$ which are $0$. Thus,

$$|S| = \operatorname{Tr}[\Phi_S^*\Phi_S] = \operatorname{Tr}[\Phi_S\Phi_S^*] = \operatorname{Tr}[UD_1U^*] = \operatorname{Tr}[D_1] = K(N/M),$$

implying that $N/M = |S|/K$ with $K \neq M$ and $|S| \neq N$. Since this contradicts the
assumption that $N/M$ is in lowest form, we have the desired result. □
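The hypotheses of Theorem 3.7 are cheap to verify numerically. The sketch below implements the three-part UNTF test (i)-(iii) with floating-point tolerances and checks the painless reconstruction formula with frame bound $A = N/M$; it is applied to the normalized simplex in $\mathbb{R}^3$ from Figure 2. The function names and the tolerance `tol` are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def is_untf(phi, tol=1e-12):
    """Check (i) equal row norms, (ii) orthogonal rows, (iii) unit-norm columns."""
    N, M = len(phi), len(phi[0])
    rows = [[phi[n][m] for n in range(N)] for m in range(M)]   # rows of the M x N matrix
    unit_columns = all(abs(dot(p, p) - 1) < tol for p in phi)
    # With unit-norm columns, equal row norms force squared row norm N/M
    equal_rows = all(abs(dot(r, r) - N / M) < tol for r in rows)
    orthogonal_rows = all(abs(dot(rows[i], rows[j])) < tol
                          for i in range(M) for j in range(i + 1, M))
    return unit_columns and equal_rows and orthogonal_rows

def reconstruct(x, phi):
    """Tight frame reconstruction x = (1/A) sum_n <x, phi_n> phi_n with A = N/M."""
    N, M = len(phi), len(phi[0])
    coeffs = [dot(x, p) for p in phi]
    return [(M / N) * sum(c * p[m] for c, p in zip(coeffs, phi)) for m in range(M)]

s = 1 / math.sqrt(3)
simplex = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
x = [0.3, -1.2, 2.0]                       # arbitrary test vector

print(is_untf(simplex), math.gcd(3, 4) == 1)
print(reconstruct(x, simplex))
```

Since the simplex passes the UNTF test and $\gcd(M, N) = \gcd(3, 4) = 1$, Theorem 3.7 applies without any subset enumeration.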
Figure 2: The simplex in $\mathbb{R}^3$. Pointing out of the page is the vector $\frac{1}{\sqrt{3}}(1, 1, 1)$, while the other vectors are the
three permutations of $\frac{1}{\sqrt{3}}(1, -1, -1)$. Together, these four vectors form a unit norm tight frame, and since
$M = 3$ and $N = 4$ are relatively prime, these yield almost injective intensity measurements in accordance
with Theorem 3.7. For this ensemble, the points $x$ such that $\mathcal{A}^{-1}(\mathcal{A}(x)) \neq \{\pm x\}$ are contained in the three
coordinate planes. Above, we depict the intersection between these planes and the unit sphere. According
to Theorem 3.9, performing phase retrieval with simplices such as this is NP-hard.


In general, whether a UNTF $\Phi$ yields almost injective intensity measurements
is determined by whether it is orthogonally partitionable: $\Phi$ is orthogonally partitionable
if there exists a partition $S \sqcup S^c = \{1, \ldots, N\}$ such that $\operatorname{span}(\Phi_S)$ is orthogonal
to $\operatorname{span}(\Phi_{S^c})$. Specifically, a UNTF yields almost injective intensity measurements
precisely when it is not orthogonally partitionable. Historically, this property of
UNTFs has been pivotal to the understanding of singularities in the algebraic variety
of UNTFs [47], and it has also played a key role in solutions to the Paulsen
problem [16,29]. However, it is not clear in general how to efficiently test for this
property; this is why Theorem 3.7 is so powerful.


3.2  The  computational  complexity  of  phase  retrieval 

The previous section characterized the real ensembles which yield almost injective
intensity measurements. The benefit of seeking almost injectivity instead of
injectivity is that we can get away with much smaller ensembles. For example,
Theorem 3.7 implies that a full spark ensemble in $\mathbb{R}^M$ of size $M + 1$ suffices for almost
injectivity, while $2M - 1$ measurements are required for injectivity (Theorem 2.2).
In this section, we demonstrate that this savings in the number of measurements
can come at a substantial price in computational requirements for phase retrieval.
In particular, we consider the following problem:

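Problem 3.8. Let $\mathcal{F} = \{\Phi_M\}_{M=2}^\infty$ be a family of ensembles $\Phi_M = \{\varphi_{M;n}\}_{n=1}^{N(M)} \subseteq
\mathbb{R}^M$, where $N(M) = \operatorname{poly}(M)$. Then ConsistentIntensities[$\mathcal{F}$] is the following
problem: Given $M \geq 2$ and a rational sequence $\{b_n\}_{n=1}^{N(M)}$, does there exist $x \in \mathbb{R}^M$
such that $|\langle x, \varphi_{M;n} \rangle| = b_n$ for every $n = 1, \ldots, N(M)$?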

In this section, we will evaluate the computational complexity of the problem
ConsistentIntensities[$\mathcal{F}$] for a large class of families of small ensembles $\mathcal{F}$, but
first, we briefly review the main concepts involved. Complexity theory is chiefly
concerned with complexity classes, which are sets of problems that share certain
computational requirements, such as time or space. For example, the complexity
class P is the set of problems which can be solved in an amount of time that is
bounded by some polynomial of the bit-length of the input. As another example,
NP contains all problems for which an affirmative answer comes with a certificate that
can be verified in polynomial time; note that P $\subseteq$ NP since for every problem $A \in$ P,
one may ignore the certificate and find the affirmative answer in polynomial time.
One key tool that is used to evaluate the complexity of a problem is called polynomial-time
reduction. This is a polynomial-time algorithm that solves a problem $A$ by
exploiting an oracle which solves another problem $B$, indicating that solving $A$ is no
harder than solving $B$ (up to polynomial factors in time); if such a reduction exists,
we write $A \leq B$. For example, any efficient phase retrieval procedure for $\mathcal{F}$ can
be used as a subroutine to solve ConsistentIntensities[$\mathcal{F}$], indicating that phase
retrieval for $\mathcal{F}$ is at least as hard as ConsistentIntensities[$\mathcal{F}$]. A problem $B$ is
called NP-hard if $B \geq A$ for every problem $A \in$ NP. Note that since $\leq$ is transitive,
it suffices to show that $B \geq C$ for some NP-hard problem $C$. Finally, a problem $B$ is
called NP-complete if $B \in$ NP is NP-hard; intuitively, NP-complete problems are the
hardest of problems in NP. It is an open problem whether P = NP, but inequality
is widely believed [39]; note that under this assumption, NP-hard problems have no
computationally efficient solution. This provides a proper context for the main result
of this section:

Theorem 3.9. Let $\mathcal{F} = \{\Phi_M\}_{M=2}^\infty$ denote a family of full spark ensembles $\Phi_M =
\{\varphi_{M;n}\}_{n=1}^{M+1} \subseteq \mathbb{R}^M$ with rational entries that can be computed in polynomial time.
Then ConsistentIntensities[$\mathcal{F}$] is NP-complete.

Note that since the ensembles $\Phi_M$ are full spark, the existence of a solution
to the phase retrieval problem $|\langle x, \varphi_{M;n} \rangle| = b_n$ for every $n = 1, \ldots, M + 1$ implies
uniqueness by Theorem 3.6. Before proving this theorem, we first relate it to a previous
hardness result from [82]. Specifically, this result can be restated using the terminology
in this paper as follows: There exists a family $\mathcal{F} = \{\Phi_M\}_{M=2}^\infty$ of ensembles
$\Phi_M = \{\varphi_{M;n}\}_{n=1}^{2M} \subseteq \mathbb{C}^M$, each of which yields almost injective intensity measurements,
such that ConsistentIntensities[$\mathcal{F}$] is NP-complete. Interestingly, these
are the smallest possible almost injective ensembles in the complex case, and we
suspect that the result can be strengthened to the obvious analogy of Theorem 3.9:

Conjecture 3.10. Let $\mathcal{F} = \{\Phi_M\}_{M=2}^\infty$ be a family of ensembles $\Phi_M = \{\varphi_{M;n}\}_{n=1}^{2M} \subseteq
\mathbb{C}^M$ which yield almost injective intensity measurements and have complex rational
entries that can be computed in polynomial time. Then ConsistentIntensities[$\mathcal{F}$]
is NP-complete.

To prove Theorem 3.9, we devise a polynomial-time reduction from the following
problem which is well-known to be NP-complete [68]:

Problem 3.11 (SubsetSum). Given a finite collection of integers $A$ and an integer
$z$, does there exist a subset $S \subseteq A$ such that $\sum_{a \in S} a = z$?

Proof of Theorem 3.9. We first show that ConsistentIntensities[$\mathcal{F}$] is in NP. Note that if there exists an $x \in \mathbb{R}^M$ such that $|\langle x, \varphi_{M;n}\rangle| = b_n$ for every $n = 1,\ldots,M+1$, then $x$ will have all rational entries. Indeed, $v := \Phi_M^* x$ has all rational entries, being a signed version of $\{b_n\}_{n=1}^{M+1}$, and so $x = (\Phi_M \Phi_M^*)^{-1} \Phi_M v$ is also rational. Thus, we can view $x$ as a certificate of finite bit-length, and for each $n = 1,\ldots,M+1$, we know that $|\langle x, \varphi_{M;n}\rangle| = b_n$ can be verified in time which is polynomial in this bit-length, as desired.

Now we show that ConsistentIntensities[$\mathcal{F}$] is NP-hard by reduction from SubsetSum. To this end, take a finite collection of integers $A$ and an integer $z$. Set $M := |A|$ and label the members of $A$ as $\{a_m\}_{m=1}^{M}$. Let $\Psi$ denote the $M \times M$ matrix whose columns are the first $M$ members of $\Phi_M$. Since $\Phi_M$ is full spark, $\Psi$ is invertible and $\Psi^{-1}\Phi_M$ has the form $[I\ w]$, where $w$ has all nonzero entries; indeed, if the $m$th entry of $w$ were zero, then $\Psi^{-1}\Phi_M \setminus \{\Psi^{-1}\varphi_{M;m}\}$ would not span, violating the full spark property of $\Phi_M$. Now define

$$b_n := \begin{cases} \left|\dfrac{a_n}{w_n}\right| & \text{if } n = 1,\ldots,M, \\[2mm] \left|2z - \displaystyle\sum_{m=1}^{M} a_m\right| & \text{if } n = M+1. \end{cases} \tag{24}$$

We claim that an oracle for ConsistentIntensities[$\mathcal{F}$] would return "yes" from the inputs $M$ and $\{b_n\}_{n=1}^{M+1}$ defined above if and only if there exists a subset $S \subseteq A$ such that $\sum_{a \in S} a = z$, which would complete the reduction.

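To prove this claim, we start with ($\Rightarrow$): Suppose there exists $x \in \mathbb{R}^M$ such that $|\langle x, \varphi_{M;n}\rangle| = b_n$ for every $n = 1,\ldots,M+1$. Then $y := \Psi^* x$, where $\Psi$ is the matrix of the first $M$ members of $\Phi_M$, satisfies $|\langle y, \Psi^{-1}\varphi_{M;n}\rangle| = b_n$ for every $n = 1,\ldots,M+1$. Since $\Psi^{-1}\Phi_M = [I\ w]$, then by (24), the entries of $y$ satisfy

$$|y_m| = \left|\frac{a_m}{w_m}\right| \quad \forall m = 1,\ldots,M \qquad \text{and} \qquad \left|\sum_{m=1}^{M} y_m w_m\right| = \left|2z - \sum_{m=1}^{M} a_m\right|.$$

By the first equation above, there exists a sequence $\{\varepsilon_m\}_{m=1}^{M}$ of $\pm 1$'s such that $y_m = \varepsilon_m a_m / w_m$ for every $m = 1,\ldots,M$, and so the second equation above gives

$$\left|2z - \sum_{m=1}^{M} a_m\right| = \left|\sum_{m=1}^{M} y_m w_m\right| = \left|\sum_{m=1}^{M} \varepsilon_m a_m\right| = \left|\sum_{\substack{m=1 \\ \varepsilon_m = 1}}^{M} a_m - \sum_{\substack{m=1 \\ \varepsilon_m = -1}}^{M} a_m\right| = \left|2\sum_{\substack{m=1 \\ \varepsilon_m = 1}}^{M} a_m - \sum_{m=1}^{M} a_m\right|.$$

Removing the absolute values, this means the left-hand side above is equal to the right-hand side, up to a sign factor. At this point, isolating $z$ reveals that $z = \sum_{m \in S} a_m$, where $S$ is either $\{m : \varepsilon_m = 1\}$ or $\{m : \varepsilon_m = -1\}$, depending on the sign factor.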

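For ($\Leftarrow$), suppose there is a subset $S \subseteq \{1,\ldots,M\}$ such that $z = \sum_{m \in S} a_m$. Define $\varepsilon_m := 1$ when $m \in S$ and $\varepsilon_m := -1$ when $m \notin S$. Then

$$\left|\sum_{m=1}^{M} \varepsilon_m a_m\right| = \left|\sum_{\substack{m=1 \\ \varepsilon_m = 1}}^{M} a_m - \sum_{\substack{m=1 \\ \varepsilon_m = -1}}^{M} a_m\right| = \left|2\sum_{\substack{m=1 \\ \varepsilon_m = 1}}^{M} a_m - \sum_{m=1}^{M} a_m\right| = \left|2z - \sum_{m=1}^{M} a_m\right|.$$

By the analysis from the ($\Rightarrow$) direction, taking $y_m := \varepsilon_m a_m / w_m$ for each $m = 1,\ldots,M$ then ensures that $|\langle y, \Psi^{-1}\varphi_{M;n}\rangle| = b_n$ for every $n = 1,\ldots,M+1$, which in turn ensures that $x := (\Psi^*)^{-1} y$ satisfies $|\langle x, \varphi_{M;n}\rangle| = b_n$ for every $n = 1,\ldots,M+1$. □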
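The reduction in the proof above can be checked on small instances. The following Python sketch is illustrative only: it works directly in the coordinates where the ensemble has the form $[I\ w]$ (so the change of basis is skipped), the function names are hypothetical, and the "oracle" is an exponential-time brute force standing in for a ConsistentIntensities solver.

```python
from fractions import Fraction
from itertools import product

def intensities_from_subset_sum(a, z, w):
    """Build (b_1, ..., b_{M+1}) as in (24) for an ensemble of the form [I w]."""
    b = [abs(Fraction(a_m) / w_m) for a_m, w_m in zip(a, w)]
    b.append(Fraction(abs(2 * z - sum(a))))
    return b

def consistent_intensities(b, w):
    """Brute-force stand-in for the oracle: try every sign pattern for y."""
    M = len(w)
    for signs in product((1, -1), repeat=M):
        # y_m = signs_m * b_m, and the last measurement vector is w.
        inner = sum(s * b_m * w_m for s, b_m, w_m in zip(signs, b[:M], w))
        if abs(inner) == b[M]:
            return True
    return False

def subset_sum(a, z):
    """Brute-force SubsetSum (the empty subset is allowed)."""
    return any(sum(a_m for a_m, keep in zip(a, mask) if keep) == z
               for mask in product((0, 1), repeat=len(a)))

# Assumed toy instance: integers a and an arbitrary rational w with nonzero entries.
a = [3, 5, 8, 13]
w = [Fraction(1, 2), Fraction(-3), Fraction(2), Fraction(7, 5)]
agree = all(
    consistent_intensities(intensities_from_subset_sum(a, z, w), w) == subset_sum(a, z)
    for z in range(-2, 32)
)
```

Exact rational arithmetic is used so that the equality test in the oracle is not affected by rounding.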

Based on Theorem 3.9, there is no polynomial-time algorithm to perform phase retrieval for minimal almost injective ensembles, assuming P $\neq$ NP. On the other hand, there exist ensembles of size $2M - 1$ for which phase retrieval is particularly efficient. For example, letting $\delta_{M;m} \in \mathbb{R}^M$ denote the $m$th identity basis element, consider the ensemble $\Phi_M := \{\delta_{M;m}\}_{m=1}^{M} \cup \{\delta_{M;1} + \delta_{M;m}\}_{m=2}^{M}$; then one can reconstruct (up to global phase) any $x$ whose first entry is nonzero by first taking $\hat{x}(1) := |\langle x, \delta_{M;1}\rangle|$, and then taking

$$\hat{x}(m) := \frac{1}{2\hat{x}(1)}\Big(|\langle x, \delta_{M;1} + \delta_{M;m}\rangle|^2 - |\langle x, \delta_{M;1}\rangle|^2 - |\langle x, \delta_{M;m}\rangle|^2\Big) \qquad \forall m = 2,\ldots,M.$$

Intuitively, we expect a redundancy threshold that determines whether phase retrieval can be efficient, and this suggests the following open problem: What is the smallest $C$ for which there exists a family of ensembles of size $N = CM + o(M)$ such that phase retrieval can be performed in polynomial time?
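As a concrete illustration of the efficient $2M-1$ ensemble described above, the following sketch (assuming NumPy, with hypothetical function names) recovers a real vector up to a global sign from the intensities against the identity basis vectors and against the sums of the first basis vector with each of the others.

```python
import numpy as np

def measure(x):
    """Intensity measurements for the 2M-1 ensemble built from identity basis vectors."""
    basis = np.abs(x)              # |<x, delta_m>| for m = 1, ..., M
    cross = np.abs(x[0] + x[1:])   # |<x, delta_1 + delta_m>| for m = 2, ..., M
    return basis, cross

def reconstruct(basis, cross):
    """Recover x up to a global sign, assuming the first entry of x is nonzero."""
    x_hat = np.empty_like(basis)
    x_hat[0] = basis[0]
    # |x_1 + x_m|^2 - |x_1|^2 - |x_m|^2 = 2 x_1 x_m, so dividing by 2|x_1| gives sign(x_1) x_m.
    x_hat[1:] = (cross**2 - basis[0]**2 - basis[1:]**2) / (2.0 * x_hat[0])
    return x_hat

x = np.array([-1.5, 2.0, 0.0, -0.25, 3.0])
x_hat = reconstruct(*measure(x))
```

The whole reconstruction costs $O(M)$ arithmetic operations, in contrast with the hardness result for ensembles of size $M+1$.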

We now consider an interesting special case of Problem 3.8 for which an approximate solution can be computed in polynomial time: Take the ensemble $\Phi \subseteq \mathbb{R}^M$ to be the $M \times M$ identity matrix with the all-ones vector as its $(M+1)$st column. Then $|(\Phi^* x)(n)| = b_n$ for all $n = 1,\ldots,M+1$ (abbreviated $|\Phi^* x| = b$), where

$$b_n := \begin{cases} |x_n| & \text{if } n = 1,\ldots,M, \\[1mm] \left|\displaystyle\sum_{m=1}^{M} x_m\right| & \text{if } n = M+1. \end{cases}$$

With this notation, we introduce Algorithm 2, which approximately solves the phase retrieval problem for the intensity measurements $|\Phi^* x| = b$. To discuss the performance of this algorithm, it helps to consider the map $\sqrt{\mathcal{A}}$, defined entrywise by $(\sqrt{\mathcal{A}}(x))(n) := |\langle x, \varphi_n\rangle|$. This mapping is actually a near-isometry under a certain metric and, unlike $\mathcal{A}$, it admits desirable performance guarantees (for details, see Section 4.1 of this paper).


Lemma 3.12. For $M \geq 2$, let $\Phi$ be the $M \times (M+1)$ matrix $[I \,|\, w]$, $w_n = 1$ for all $n = 1,\ldots,M$, and take $c = \varepsilon / (2\sqrt{M})$ for any $\varepsilon > 0$. Then Algorithm 2 produces an estimate $\hat{x}$ such that

$$\frac{\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2}{\|x\|_2} \leq \varepsilon$$

after $O(M^{2.5}\varepsilon^{-1})$ operations.

Algorithm 2 Approximate $|\Phi^* x| = b$ solver for $\Phi = [I \,|\, w]$, $w_n = 1$ for all $n = 1,\ldots,M$

Input: Rational vector $b$ of length $M + 1$
Output: Approximate solution $\hat{x}$ to $|\Phi^* x| = b$

Fix a threshold $c$ and set $p := \frac{1}{2}\sum_{m=1}^{M+1} b_m$
Initialize an $(M+1) \times M$ matrix $S$ of zeros
for $j = 1$ to $M$ do
    $\hat{S} \leftarrow S + [b_j; \delta_j]\,\mathbf{1}^{T}$ {$\delta_j$ is the discrete Dirac-$\delta$ function at $j$}
    $S \leftarrow [S \;\; \hat{S}]$
    $S = \mathrm{sort}\{S\}$ {Sort $S$ by its first row}
    for $k = 1$ to {number of columns in $S$} do
        Remove the $k$th column of $S$ if its first entry is greater than $p$ or within $cp/M$ of the first entry in the $(k-1)$st column
    end for
end for
if First entry of the last column of $S$ is greater than or equal to $(1 - c)p$ then
    Define $\varepsilon_m = (-1)^{a_m}$ where $(a_1,\ldots,a_M)$ are the remaining entries of the last column of $S$
    Output: $\hat{x} = (\varepsilon_m b_m)_{m=1}^{M}$
else
    Output: "INCONSISTENT"
end if
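To make the bookkeeping in Algorithm 2 concrete, here is a minimal Python sketch of the same trimmed subset-sum idea. It is one reading of the pseudocode rather than a transcription: each column of $S$ is stored as a pair (partial sum, chosen indices), the trimming step compares against the last column that was kept, and the function name is hypothetical. Exact rational arithmetic is used because the input is assumed rational.

```python
from fractions import Fraction

def approx_sign_recovery(b, c):
    """Sketch of Algorithm 2 for Phi = [I | 1]: b[:M] = |x_m| and b[M] = |sum_m x_m|."""
    b = [Fraction(v) for v in b]
    c = Fraction(c)
    M = len(b) - 1
    p = sum(b) / 2
    candidates = [(Fraction(0), ())]          # (partial sum, indices chosen so far)
    for j in range(M):
        # Extend every kept column by the next entry of b, then sort by partial sum.
        extended = [(s + b[j], chosen + (j,)) for s, chosen in candidates]
        merged = sorted(candidates + extended, key=lambda col: col[0])
        candidates = []
        for s, chosen in merged:
            if s > p:
                continue                      # partial sum overshoots p
            if candidates and s - candidates[-1][0] <= c * p / M:
                continue                      # within cp/M of the previous kept column
            candidates.append((s, chosen))
    best_sum, chosen = candidates[-1]
    if best_sum < (1 - c) * p:
        return None                           # corresponds to "INCONSISTENT"
    x_hat = b[:M]
    for j in chosen:
        x_hat[j] = -x_hat[j]                  # chosen indices receive a negative sign
    return x_hat

# Assumed toy signal; only its intensities b are passed to the solver.
x = [3, -1, 4, -1, -5, 9, -2, 6]
b = [abs(v) for v in x] + [abs(sum(x))]
c = Fraction(1, 10)
x_hat = approx_sign_recovery(b, c)
```

Each pass of the outer loop loses at most $cp/M$ to trimming, so the largest surviving partial sum is within $cp$ of the best achievable one; this is the property the proof of Lemma 3.12 relies on.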

Proof. Suppose $\hat{x} \in \mathbb{R}^M$ is the estimate produced by Algorithm 2. Note that the algorithm guarantees the first $M$ entries of $\sqrt{\mathcal{A}}(\hat{x})$ are identical to those of $\sqrt{\mathcal{A}}(x)$, and so $\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 = \big|\,|\sum_{m=1}^{M} \hat{x}_m| - |\sum_{m=1}^{M} x_m|\,\big|$. Since

$$p = \frac{1}{2}\sum_{m=1}^{M+1} b_m = \frac{1}{2}\sum_{n=1}^{M+1} |\langle x, \varphi_n\rangle| = \frac{1}{2}\left(\sum_{m=1}^{M} |x_m| + \left|\sum_{m=1}^{M} x_m\right|\right),$$

we see that $|\sum_{m=1}^{M} x_m| = 2p - \sum_{m=1}^{M} |x_m| = 2p - \|x\|_1$. Moreover, since $|\hat{x}_m| = |x_m|$ for every $m$, we have

$$\sum_{m=1}^{M} \hat{x}_m = \sum_{\hat{x}_m > 0} |\hat{x}_m| - \sum_{\hat{x}_m < 0} |\hat{x}_m| = \|x\|_1 - 2\sum_{\hat{x}_m < 0} |\hat{x}_m|,$$

and so an application of the triangle inequality yields

$$\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 = \left|\,\left|\sum_{m=1}^{M} \hat{x}_m\right| - \left|\sum_{m=1}^{M} x_m\right|\,\right| = \left|\,\Big|\|x\|_1 - 2\sum_{\hat{x}_m < 0} |\hat{x}_m|\Big| - \big|2p - \|x\|_1\big|\,\right| \leq \left|\Big(\|x\|_1 - 2\sum_{\hat{x}_m < 0} |\hat{x}_m|\Big) + \big(2p - \|x\|_1\big)\right| = 2\Big(p - \sum_{\hat{x}_m < 0} |\hat{x}_m|\Big).$$

Since the algorithm ensures that $\sum_{\hat{x}_m < 0} |\hat{x}_m| \geq (1 - c)p$, this simplifies to

$$\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 \leq 2cp \leq 2\sqrt{M}\,c\,\|x\|_2 = \varepsilon\|x\|_2,$$

which rearranges to give the desired bound. To complete the proof, we count operations. The majority of the work in Algorithm 2 is done within the first for loop. In fact, the remainder of the algorithm is performed in $O(M)$ steps, so we will only focus on the first loop. At each iteration, the number of operations performed on the matrix $S$ is dependent on the number of columns since for each column we add a new column by incorporating the next entry of the vector $b$. Due to the trimming step in the second for loop, however, we limit the number of columns kept at each iteration, thereby limiting the number of operations performed. Since there can be no more than $p/(cp/M) = M/c$ columns at any iteration, an upper bound on the number of operations is $M^2/c + O(M) = 2M^{2.5}/\varepsilon + O(M) = O(M^{2.5}\varepsilon^{-1})$. □

Lemma 3.12 shows that Algorithm 2 produces an estimate whose intensity measurements approximate the true intensity measurements. As such, one can obtain an approximate solution to the phase retrieval problem for the ensemble $\Phi$ in polynomial time if willing to work with the estimated intensity measurements:

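Theorem 3.13. For $M \geq 2$, let $\Phi$ be the $M \times (M+1)$ matrix $[I \,|\, w]$, $w_n = 1$ for all $n = 1,\ldots,M$, and suppose an estimate $\hat{x}$ for $x \in \mathbb{R}^M$ is produced by Algorithm 2. Then for every nonempty proper subset $S \subseteq \{1,2,\ldots,M+1\}$ and any $\varepsilon > 0$,

$$\left|\sum_{m \in S} x_m\right| < \frac{\varepsilon}{2}\|x\|_2 \quad \text{and} \quad \hat{x} = \pm x.$$

Proof. We seek the contrapositive. First recall that the estimate $\hat{x}$ produced by Algorithm 2 has the property that $\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 = \big|\,|\sum_{m=1}^{M} \hat{x}_m| - |\sum_{m=1}^{M} x_m|\,\big|$. Now suppose that for every nonempty proper subset $S \subseteq \{1,2,\ldots,M+1\}$ and any $\varepsilon > 0$ we have $|\sum_{m \in S} x_m| > \frac{\varepsilon}{2}\|x\|_2$ and assume $\hat{x} \neq \pm x$. Then there exists a nonempty proper subset $S \subseteq \{1,2,\ldots,M+1\}$ such that $\hat{x}_m = x_m$ for all $m \in S$ and $\hat{x}_m = -x_m$ for all $m \in S^c$. Thus,

$$\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 = \left|\,\left|\sum_{m \in S} x_m - \sum_{m \in S^c} x_m\right| - \left|\sum_{m \in S} x_m + \sum_{m \in S^c} x_m\right|\,\right| = 2\min\left\{\left|\sum_{m \in S} x_m\right|, \left|\sum_{m \in S^c} x_m\right|\right\},$$

where the final equality follows from the relation $\big|\,|a| - |b|\,\big| = \min\{|a + b|, |a - b|\}$ for every $a, b \in \mathbb{R}$. By the assumption on $x$ we then have

$$\|\sqrt{\mathcal{A}}(\hat{x}) - \sqrt{\mathcal{A}}(x)\|_2 = 2\min\left\{\left|\sum_{m \in S} x_m\right|, \left|\sum_{m \in S^c} x_m\right|\right\} > \varepsilon\|x\|_2,$$

violating the result of Lemma 3.12. Hence, $\hat{x}$ could not have been produced by Algorithm 2, as desired. □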

To  be  clear,  in  the  average  case  it  is  still  highly  unlikely  that  the  true  signal 
x  is  recoverable  in  non-exponential  time  from  this  type  of  ensemble.  However,  this 
isn’t  too  surprising,  since  it  is  expected  that  the  smallest  constant  C  for  which  there 
exists  a  family  of  ensembles  of  size  N  =  CM  +  o(M)  such  that  phase  retrieval  can 
be performed in polynomial time is greater than one.

IV. The stability of phase retrieval

In  order  for  methods  of  phase  retrieval  to  be  useful  for  practical  applications,  they 
must  be  able  to  combat  noise.  At  the  very  least,  we  require  some  semblance  of 
continuity  in  the  intensity  measurement  mapping;  that  is,  if  a  signal’s  intensity 
measurements  are  perturbed  slightly  (e.g.,  by  noise  in  the  measurement  process), 
we  seek  a  bound  on  the  “closeness”  of  the  estimated  signal  to  the  true  signal.  This 
concept  is  known  as  stability,  and  it  is  the  focus  of  this  chapter.  Here,  we  analyze 
stability  in  phase  retrieval  for  both  the  worst  and  average  cases.  For  the  former,  we 
develop  a  new  condition  which  strengthens  the  complement  property  of  Section  2.1; 
for  the  latter,  we  use  a  stochastic  noise  model  to  develop  stronger  versions  of  the 
injectivity  characterizations  of  Chapter  II. 

4.1 Stability in the worst case

As far as applications are concerned, the stability of reconstruction is perhaps the most important consideration. To date, the only known stability results come from PhaseLift [27], the polarization method [3], and a very recent paper of Eldar and Mendelson [49]. This last paper focuses on the real case, and analyzes how well subgaussian random measurement vectors distinguish signals, thereby yielding some notion of stability which is independent of the reconstruction algorithm used. In particular, given independent random measurement vectors $\{\varphi_n\}_{n=1}^{N} \subseteq \mathbb{R}^M$, Eldar and Mendelson evaluated measurement separation by finding a constant $C$ such that

$$\|\mathcal{A}(x) - \mathcal{A}(y)\|_1 \geq C\|x - y\|_2\|x + y\|_2 \qquad \forall x, y \in \mathbb{R}^M, \tag{25}$$

where $\mathcal{A}\colon \mathbb{R}^M \to \mathbb{R}^N$ is the intensity measurement process defined by $(\mathcal{A}(x))(n) := |\langle x, \varphi_n\rangle|^2$. With this, we can say that if $\mathcal{A}(x)$ and $\mathcal{A}(y)$ are close, then $x$ must be close to either $\pm y$, and even closer for larger $C$. By the contrapositive, distant signals will not be confused in the measurement domain because $\mathcal{A}$ does a good job of separating them.

One interesting feature of (25) is that increasing the lengths of the measurement vectors $\{\varphi_n\}_{n=1}^{N}$ will in turn increase $C$, meaning the measurements are better separated. As such, for any given magnitude of noise, one can simply amplify the measurement process so as to drown out the noise and ensure stability. However, such amplification could be rather expensive, and so this motivates a different notion of stability, one that is invariant to how the measurement ensemble is scaled. One approach is to build on intuition from Lemma 2.8; that is, a super analysis operator is intuitively more stable if its null space is distant from all rank-2 operators simultaneously; since this null space is invariant to how the measurement vectors are scaled, this is one prospective (and particularly geometric) notion of stability. In this section, we will focus on another alternative. Note that $d(x, y) := \min\{\|x - y\|, \|x + y\|\}$ defines a metric on $\mathbb{R}^M/\{\pm 1\}$, and consider the following:

Definition 4.1. We say $f\colon \mathbb{R}^M/\{\pm 1\} \to \mathbb{R}^N$ is $C$-stable if for every $\mathrm{SNR} > 0$, there exists an estimator $g\colon \mathbb{R}^N \to \mathbb{R}^M/\{\pm 1\}$ such that for every nonzero signal $x \in \mathbb{R}^M/\{\pm 1\}$ and adversarial noise term $z$ with $\|z\|^2 \leq \|f(x)\|^2/\mathrm{SNR}$, the relative error in reconstruction satisfies

$$\frac{d\big(g(f(x) + z), x\big)}{\|x\|} \leq \frac{C}{\sqrt{\mathrm{SNR}}}.$$

According to this definition, $f$ is more stable when $C$ is smaller. Also, because of the SNR (signal-to-noise ratio) model, $f$ is $C$-stable if and only if every nonzero multiple of $f$ is also $C$-stable. Indeed, taking $\tilde{f} := cf$ for some nonzero scalar $c$, then for every adversarial noise term $\tilde{z}$ which is admissible for $\tilde{f}(x)$ and SNR, we have that $z := \tilde{z}/c$ is admissible for $f(x)$ and SNR; as such, $\tilde{f}$ inherits $f$'s $C$-stability by using the estimator $\tilde{g}$ defined by $\tilde{g}(y) := g(y/c)$. Overall, this notion of stability offers the invariance to scaling we originally desired. With this, if we find a measurement process $f$ which is $C$-stable with minimal $C$, at that point, we can take advantage of noise with bounded magnitude by amplifying $f$ (and thereby effectively increasing SNR) until the relative error in reconstruction is tolerable.

Now  that  we  have  a  notion  of  stability,  we  provide  a  sufficient  condition: 

Theorem 4.2. Suppose $f$ is bilipschitz, that is, there exist constants $0 < \alpha \leq \beta < \infty$ such that

$$\alpha\, d(x, y) \leq \|f(x) - f(y)\| \leq \beta\, d(x, y) \qquad \forall x, y \in \mathbb{R}^M/\{\pm 1\}.$$

If $f(0) = 0$, then $f$ is $\frac{2\beta}{\alpha}$-stable.

Proof. Consider the projection function $P\colon \mathbb{R}^N \to \mathrm{range}(f)$ defined by

$$P(y) := \operatorname*{argmin}_{y' \in \mathrm{range}(f)} \|y' - y\| \qquad \forall y \in \mathbb{R}^N.$$

In cases where the minimizer is not unique, we will pick one of them to be $P(y)$. For $P$ to be well-defined, we claim it suffices for $\mathrm{range}(f)$ to be closed. Indeed, this ensures that a minimizer always exists; since $0 \in \mathrm{range}(f)$, any prospective minimizer must be no farther from $y$ than $0$ is, meaning we can equivalently minimize over the intersection of $\mathrm{range}(f)$ and the closed ball of radius $\|y\|$ centered at $y$; this intersection is compact, and so a minimizer necessarily exists. In order to avoid using the axiom of choice, we also want a systematic method of breaking ties when the minimizer is not unique, but this can be done using lexicographic ideas provided $\mathrm{range}(f)$ is closed.

We now show that $\mathrm{range}(f)$ is, in fact, closed. Pick a convergent sequence $\{y_n\}_{n=1}^{\infty} \subseteq \mathrm{range}(f)$. This sequence is necessarily Cauchy, which means the corresponding sequence of inverse images $\{x_n\}_{n=1}^{\infty} \subseteq \mathbb{R}^M/\{\pm 1\}$ is also Cauchy (using the lower Lipschitz bound $\alpha > 0$). Arbitrarily pick a representative $z_n \in \mathbb{R}^M$ for each $x_n$. Then $\{z_n\}_{n=1}^{\infty}$ is bounded, and thus has a subsequence that converges to some $z \in \mathbb{R}^M$. Denote $x := \{\pm z\} \in \mathbb{R}^M/\{\pm 1\}$. Then $d(x_n, x) \leq \|z_n - z\|$, and so $\{x_n\}_{n=1}^{\infty}$ has a subsequence which converges to $x$. Since $\{x_n\}_{n=1}^{\infty}$ is also Cauchy, we therefore have $x_n \to x$. Then the upper Lipschitz bound $\beta < \infty$ gives that $f(x) \in \mathrm{range}(f)$ is the limit of $\{y_n\}_{n=1}^{\infty}$.

Now that we know $P$ is well-defined, we continue. Since $\alpha > 0$, we know $f$ is injective, and so we can take $g := f^{-1} \circ P$. In fact, $\alpha^{-1}$ is a Lipschitz bound for $f^{-1}$, implying

$$d\big(g(f(x) + z), x\big) = d\Big(f^{-1}\big(P(f(x) + z)\big), f^{-1}\big(f(x)\big)\Big) \leq \alpha^{-1}\|P(f(x) + z) - f(x)\|. \tag{26}$$

Furthermore, the triangle inequality and the definition of $P$ together give

$$\|P(f(x) + z) - f(x)\| \leq \|P(f(x) + z) - (f(x) + z)\| + \|z\| \leq \|f(x) - (f(x) + z)\| + \|z\| = 2\|z\|. \tag{27}$$

Combining (26) and (27) then gives

$$\frac{d\big(g(f(x) + z), x\big)}{\|x\|} \leq 2\alpha^{-1}\frac{\|z\|}{\|x\|} \leq \frac{2\alpha^{-1}}{\sqrt{\mathrm{SNR}}}\cdot\frac{\|f(x)\|}{\|x\|} = \frac{2\alpha^{-1}}{\sqrt{\mathrm{SNR}}}\cdot\frac{\|f(x) - f(0)\|}{\|x - 0\|} \leq \frac{2\beta/\alpha}{\sqrt{\mathrm{SNR}}},$$

as desired. □

Note that the "project-and-invert" estimator we used to demonstrate stability is far from new. For example, if the noise were modeled as Gaussian random, then project-and-invert is precisely the maximum likelihood estimator. However, stochastic noise models warrant a much deeper analysis, since in this regime one is often concerned with the bias and variance of estimates. As such, we will investigate these issues in the next section. Another example of project-and-invert is the Moore-Penrose pseudoinverse of an $N \times M$ matrix $A$ of rank $M$. Using the obvious reformulation of $C$-stable in this linear case, it can be shown that $C$ is the condition number of $A$, meaning $\alpha$ and $\beta$ are analogous to the smallest and largest singular values. The extra factor of 2 in the stability constant of Theorem 4.2 is an artifact of the nonlinear setting: For the sake of illustration, suppose $\mathrm{range}(f)$ is the unit circle and $f(x) = (-1, 0)$ but $z = (1 + \varepsilon, 0)$; then $P(f(x) + z) = (1, 0)$, which is just shy of $2\|z\|$ away from $f(x)$. This sort of behavior is not exhibited in the linear case, in which $\mathrm{range}(f)$ is a subspace.
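The unit-circle illustration is easy to reproduce numerically. The sketch below (assuming NumPy; the variable names and the random linear example are ours) compares the displacement caused by projecting onto the unit circle with the displacement caused by projecting onto a subspace.

```python
import numpy as np

def project_to_unit_circle(y):
    """Nearest point on the unit circle to a nonzero y in R^2."""
    return y / np.linalg.norm(y)

# Nonlinear case: range(f) is the unit circle, f(x) = (-1, 0), z = (1 + eps, 0).
eps = 1e-3
fx = np.array([-1.0, 0.0])
z = np.array([1.0 + eps, 0.0])
projected = project_to_unit_circle(fx + z)
circle_ratio = np.linalg.norm(projected - fx) / np.linalg.norm(z)

# Linear case: range(f) is the column space of an arbitrary full-rank matrix A.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
x = rng.standard_normal(3)
noise = rng.standard_normal(6)
Q, _ = np.linalg.qr(A)                    # orthonormal basis for range(A)
linear_projected = Q @ (Q.T @ (A @ x + noise))
linear_ratio = np.linalg.norm(linear_projected - A @ x) / np.linalg.norm(noise)
```

Here `circle_ratio` equals $2/(1+\varepsilon)$, just under 2, while `linear_ratio` never exceeds 1 because orthogonal projection onto a subspace is nonexpansive and fixes $Ax$.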

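Having established the sufficiency of bilipschitz for stability, we now note that $\mathcal{A}$ is not bilipschitz. In fact, more generally, $\mathcal{A}$ fails to satisfy any Hölder condition. To see this, pick some nonzero measurement vector $\varphi_n$ and scalars $C > 0$ and $\alpha > 0$. Then

$$\frac{\|\mathcal{A}((C+1)\varphi_n) - \mathcal{A}(\varphi_n)\|}{d\big((C+1)\varphi_n, \varphi_n\big)^{\alpha}} = \frac{1}{\|C\varphi_n\|^{\alpha}}\left(\sum_{n'=1}^{N}\Big(|\langle (C+1)\varphi_n, \varphi_{n'}\rangle|^2 - |\langle \varphi_n, \varphi_{n'}\rangle|^2\Big)^2\right)^{1/2} = \frac{(C+1)^2 - 1}{C^{\alpha}}\cdot\frac{\|\mathcal{A}(\varphi_n)\|}{\|\varphi_n\|^{\alpha}}.$$

Furthermore, $\|\mathcal{A}(\varphi_n)\| \geq |(\mathcal{A}(\varphi_n))(n)| = \|\varphi_n\|^4 > 0$, while $\frac{(C+1)^2 - 1}{C^{\alpha}}$ diverges as $C \to \infty$, assuming $\alpha \leq 1$; when $\alpha > 1$, it also diverges as $C \to 0$, but this case is not interesting for infamous reasons [73].

All is not lost, however. As we will see, with this notion of stability, it happens to be more convenient to consider the map $\sqrt{\mathcal{A}}$, defined entrywise by $(\sqrt{\mathcal{A}}(x))(n) := |\langle x, \varphi_n\rangle|$. Considering Theorem 4.2, we are chiefly interested in the optimal constants $0 < \alpha \leq \beta < \infty$ for which

$$\alpha\, d(x, y) \leq \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\| \leq \beta\, d(x, y) \qquad \forall x, y \in \mathbb{R}^M/\{\pm 1\}. \tag{28}$$

In particular, Theorem 4.2 guarantees more stability when $\alpha$ and $\beta$ are closer together; this indicates that when suitably scaled, we want $\sqrt{\mathcal{A}}$ to act as a near-isometry, despite being a nonlinear function. The following lemma gives the upper Lipschitz constant:

Lemma 4.3. The upper Lipschitz constant for $\sqrt{\mathcal{A}}$ is $\beta = \|\Phi^*\|_2$.

Proof. By the reverse triangle inequality, we have

$$\big|\,|a| - |b|\,\big| \leq \min\{|a - b|, |a + b|\} \qquad \forall a, b \in \mathbb{R}.$$

Thus, for all $x, y \in \mathbb{R}^M/\{\pm 1\}$,

$$\begin{aligned} \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\|^2 &= \sum_{n=1}^{N}\big|\,|\langle x, \varphi_n\rangle| - |\langle y, \varphi_n\rangle|\,\big|^2 \\ &\leq \sum_{n=1}^{N}\Big(\min\big\{|\langle x - y, \varphi_n\rangle|, |\langle x + y, \varphi_n\rangle|\big\}\Big)^2 \\ &\leq \min\big\{\|\Phi^*(x - y)\|^2, \|\Phi^*(x + y)\|^2\big\} \\ &\leq \|\Phi^*\|_2^2\,\big(d(x, y)\big)^2. \end{aligned} \tag{29}$$

Furthermore, picking a nonzero $x \in \mathbb{R}^M$ such that $\|\Phi^* x\| = \|\Phi^*\|_2\|x\|$ gives

$$\|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(0)\| = \|\sqrt{\mathcal{A}}(x)\| = \|\Phi^* x\| = \|\Phi^*\|_2\|x\| = \|\Phi^*\|_2\, d(x, 0),$$

thereby achieving equality in (29). □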
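Lemma 4.3 can be sanity-checked numerically. The following sketch (assuming NumPy, with an arbitrary random ensemble and hypothetical helper names) samples signal pairs, confirms that the ratio in (28) never exceeds the spectral norm of $\Phi^*$, and verifies that the top left singular vector of $\Phi$ attains it against the zero signal.

```python
import numpy as np

def sqrt_A(Phi, x):
    """Entrywise magnitudes |<x, phi_n>| for the columns phi_n of Phi."""
    return np.abs(Phi.T @ x)

def d(x, y):
    """Distance on R^M modulo a global sign."""
    return min(np.linalg.norm(x - y), np.linalg.norm(x + y))

rng = np.random.default_rng(1)
M, N = 4, 9
Phi = rng.standard_normal((M, N))
beta = np.linalg.norm(Phi.T, 2)           # spectral norm of Phi^*

ratios = []
for _ in range(2000):
    x, y = rng.standard_normal(M), rng.standard_normal(M)
    ratios.append(np.linalg.norm(sqrt_A(Phi, x) - sqrt_A(Phi, y)) / d(x, y))
worst_sampled = max(ratios)

# Equality is attained by a unit vector x with ||Phi^* x|| = ||Phi^*||_2, paired with y = 0.
U, s, Vt = np.linalg.svd(Phi)
x_top = U[:, 0]
attained = np.linalg.norm(sqrt_A(Phi, x_top))
```

Random sampling only probes the bound from below, so `worst_sampled` will typically sit strictly under `beta`; the extremal pair is what shows the constant is sharp.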

The  lower  Lipschitz  bound  is  much  more  difficult  to  determine.  Our  approach 
to  analyzing  this  bound  is  based  on  the  following  definition: 

Definition 4.4. We say an $M \times N$ matrix $\Phi$ satisfies the $\sigma$-strong complement property ($\sigma$-SCP) if

$$\max\big\{\lambda_{\min}(\Phi_S\Phi_S^*), \lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*)\big\} \geq \sigma^2$$

for every $S \subseteq \{1,\ldots,N\}$.

This is a numerical version of the complement property discussed earlier (Section 2.1). It bears some resemblance to other matrix properties, namely combinatorial properties regarding the conditioning of submatrices, e.g., the restricted isometry property [23], the Kadison-Singer problem [31] and numerically erasure-robust frames [51]. We are interested in $\sigma$-SCP because it is closely related to the lower Lipschitz bound in (28):

Theorem 4.5. The lower Lipschitz constant for $\sqrt{\mathcal{A}}$ satisfies

$$\sigma \leq \alpha \leq \sqrt{2}\,\sigma,$$

where $\sigma$ is the largest scalar for which $\Phi$ has the $\sigma$-strong complement property.

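Proof. By analogy with the proof of Theorem 2.2, we start by proving the upper bound. Pick $\varepsilon > 0$ and note that $\Phi$ is not $(\sigma + \varepsilon)$-SCP. Then there exists $S \subseteq \{1,\ldots,N\}$ such that both $\lambda_{\min}(\Phi_S\Phi_S^*) < (\sigma + \varepsilon)^2$ and $\lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*) < (\sigma + \varepsilon)^2$. This implies that there exist unit (eigen)vectors $u, v \in \mathbb{R}^M$ such that $\|\Phi_S^* u\| < (\sigma + \varepsilon)\|u\|$ and $\|\Phi_{S^c}^* v\| < (\sigma + \varepsilon)\|v\|$. Taking $x := u + v$ and $y := u - v$ then gives

$$\begin{aligned} \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\|^2 &= \sum_{n=1}^{N}\big|\,|\langle u + v, \varphi_n\rangle| - |\langle u - v, \varphi_n\rangle|\,\big|^2 \\ &= \sum_{n \in S}\big|\,|\langle u + v, \varphi_n\rangle| - |\langle u - v, \varphi_n\rangle|\,\big|^2 + \sum_{n \in S^c}\big|\,|\langle u + v, \varphi_n\rangle| - |\langle u - v, \varphi_n\rangle|\,\big|^2 \\ &\leq 4\sum_{n \in S}|\langle u, \varphi_n\rangle|^2 + 4\sum_{n \in S^c}|\langle v, \varphi_n\rangle|^2, \end{aligned}$$

where the last step follows from the reverse triangle inequality. Next, we apply our assumptions on $u$ and $v$:

$$\begin{aligned} \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\|^2 &\leq 4\big(\|\Phi_S^* u\|^2 + \|\Phi_{S^c}^* v\|^2\big) \\ &< 4(\sigma + \varepsilon)^2\big(\|u\|^2 + \|v\|^2\big) \\ &= 8(\sigma + \varepsilon)^2\min\{\|u\|^2, \|v\|^2\} = 2(\sigma + \varepsilon)^2\big(d(x, y)\big)^2. \end{aligned}$$

Thus, $\alpha < \sqrt{2}(\sigma + \varepsilon)$ for all $\varepsilon > 0$, and so $\alpha \leq \sqrt{2}\,\sigma$.

Next, to prove the lower bound, take $\varepsilon > 0$ and pick $x, y \in \mathbb{R}^M/\{\pm 1\}$ such that

$$(\alpha + \varepsilon)\, d(x, y) > \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\|.$$

We will show that $\Phi$ is not $(\alpha + \varepsilon)$-SCP. To this end, pick

$$S := \{n : \operatorname{sign}\langle x, \varphi_n\rangle = -\operatorname{sign}\langle y, \varphi_n\rangle\}$$

and define $u := x + y$ and $v := x - y$. Then the definition of $S$ gives

$$\|\Phi_S^* u\|^2 = \sum_{n \in S}|\langle x, \varphi_n\rangle + \langle y, \varphi_n\rangle|^2 = \sum_{n \in S}\big|\,|\langle x, \varphi_n\rangle| - |\langle y, \varphi_n\rangle|\,\big|^2,$$

and similarly $\|\Phi_{S^c}^* v\|^2 = \sum_{n \in S^c}\big|\,|\langle x, \varphi_n\rangle| - |\langle y, \varphi_n\rangle|\,\big|^2$. Adding these together then gives

$$\|\Phi_S^* u\|^2 + \|\Phi_{S^c}^* v\|^2 = \sum_{n=1}^{N}\big|\,|\langle x, \varphi_n\rangle| - |\langle y, \varphi_n\rangle|\,\big|^2 = \|\sqrt{\mathcal{A}}(x) - \sqrt{\mathcal{A}}(y)\|^2 < (\alpha + \varepsilon)^2\big(d(x, y)\big)^2,$$

implying both $\|\Phi_S^* u\| < (\alpha + \varepsilon)\|u\|$ and $\|\Phi_{S^c}^* v\| < (\alpha + \varepsilon)\|v\|$. Therefore, $\Phi$ is not $(\alpha + \varepsilon)$-SCP, i.e., $\sigma < \alpha + \varepsilon$ for all $\varepsilon > 0$, which in turn implies the desired lower bound. □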
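For very small ensembles, the strong complement property constant can be computed by exhaustive search over subsets, which makes the lower bound of Theorem 4.5 easy to probe. The sketch below assumes NumPy and a random $3 \times 7$ ensemble; the helper names are hypothetical, and the search is exponential in $N$, so it is only meant as an illustration.

```python
import numpy as np
from itertools import combinations

def lambda_min(Phi_S):
    """Smallest eigenvalue of Phi_S Phi_S^*, which is 0 when there are fewer than M columns."""
    M, K = Phi_S.shape
    if K < M:
        return 0.0
    return max(np.linalg.eigvalsh(Phi_S @ Phi_S.T)[0], 0.0)

def scp_constant(Phi):
    """Largest sigma such that Phi is sigma-SCP, by brute force over all subsets S."""
    M, N = Phi.shape
    worst = np.inf
    for k in range(N + 1):
        for S in combinations(range(N), k):
            Sc = [n for n in range(N) if n not in S]
            worst = min(worst, max(lambda_min(Phi[:, list(S)]), lambda_min(Phi[:, Sc])))
    return np.sqrt(worst)

def d(x, y):
    return min(np.linalg.norm(x - y), np.linalg.norm(x + y))

rng = np.random.default_rng(2)
M, N = 3, 7
Phi = rng.standard_normal((M, N))
sigma = scp_constant(Phi)

# Every sampled ratio upper-bounds the lower Lipschitz constant, which is at least sigma.
sampled = []
for _ in range(500):
    x, y = rng.standard_normal(M), rng.standard_normal(M)
    sampled.append(np.linalg.norm(np.abs(Phi.T @ x) - np.abs(Phi.T @ y)) / d(x, y))
smallest_sampled = min(sampled)
```

Sampling cannot certify the upper bound $\alpha \leq \sqrt{2}\,\sigma$, since the minimizing pair is generally not found at random; it only confirms that no sampled pair violates $\sigma \leq \alpha$.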

Note that all of this analysis specifically treats the real case; indeed, the metric we use would not be appropriate in the complex case. However, just like the complement property is necessary for injectivity in the complex case (Theorem 2.6), it is suspected that the strong complement property is necessary for stability in the complex case, but we have no proof of this.

As an example of how to apply Theorem 4.5, pick $M$ and $N$ to both be even and let $F = \{f_n\}_{n \in \mathbb{Z}_N}$ be the $\frac{M}{2} \times N$ matrix obtained by collecting the first $\frac{M}{2}$ rows of the $N \times N$ discrete Fourier transform matrix with entries of unit modulus. Next, take $\Phi = \{\varphi_n\}_{n \in \mathbb{Z}_N}$ to be the $M \times N$ matrix formed by stacking the real and imaginary parts of $F$ and normalizing the resulting columns (i.e., multiplying by $\sqrt{2/M}$). Then $\Phi$ happens to be a self-localized finite frame due to the rapid decay in coherence between columns. To be explicit, first note that

$$\begin{aligned} |\langle \varphi_n, \varphi_{n'}\rangle|^2 &= \frac{4}{M^2}\big|\langle \operatorname{Re} f_n, \operatorname{Re} f_{n'}\rangle + \langle \operatorname{Im} f_n, \operatorname{Im} f_{n'}\rangle\big|^2 \\ &\leq \frac{4}{M^2}\Big|\big(\langle \operatorname{Re} f_n, \operatorname{Re} f_{n'}\rangle + \langle \operatorname{Im} f_n, \operatorname{Im} f_{n'}\rangle\big) + i\big(\langle \operatorname{Im} f_n, \operatorname{Re} f_{n'}\rangle - \langle \operatorname{Re} f_n, \operatorname{Im} f_{n'}\rangle\big)\Big|^2 \\ &= \frac{4}{M^2}|\langle f_n, f_{n'}\rangle|^2, \end{aligned}$$

and furthermore, when $n \neq n'$, the geometric sum formula gives

$$|\langle f_n, f_{n'}\rangle|^2 = \left|\sum_{m=0}^{M/2 - 1} e^{2\pi i m(n - n')/N}\right|^2 = \frac{\sin^2\big(M\pi(n - n')/(2N)\big)}{\sin^2\big(\pi(n - n')/N\big)} \leq \frac{1}{\sin^2\big(\pi(n - n')/N\big)}.$$

Taking $u := \varphi_0$, $v := \varphi_{N/2}$ and $S := \{n : \frac{N}{4} \leq n < \frac{3N}{4}\}$, we then have

$$\frac{\|\Phi_S^* u\|^2}{\|u\|^2} = \|\Phi_S^* u\|^2 = \sum_{n \in S}|\langle \varphi_0, \varphi_n\rangle|^2 \leq \frac{4}{M^2}\sum_{n \in S}\frac{1}{\sin^2(\pi n/N)} \leq \frac{4}{M^2}\cdot\frac{N/2}{\sin^2(\pi/4)} = \frac{4N}{M^2},$$

and similarly for $\frac{\|\Phi_{S^c}^* v\|^2}{\|v\|^2}$. As such, if $N = o(M^2)$, then $\Phi$ is $\sigma$-SCP only if $\sigma$ vanishes, meaning phase retrieval with $\Phi$ necessarily lacks the stability guarantee of Theorem 4.5. As a rule of thumb, self-localized frames fail to provide stable phase retrieval for this very reason; just as we cannot stably distinguish between $\varphi_0 + \varphi_{N/2}$ and $\varphi_0 - \varphi_{N/2}$ in this case, in general, signals consisting of "distant" components bring similar instability. This intuition was first pointed out by Irene Waldspurger; here it is simply made more rigorous with the notion of $\sigma$-SCP. This means that stable phase retrieval from localized measurements must either use prior information about the signal (e.g., connected support) or additional measurements; indeed, this dichotomy has already made its mark on the Fourier-based phase retrieval literature [50,65].
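The self-localized Fourier example can be reproduced directly. The following sketch assumes NumPy, takes the first $M/2$ DFT rows to be those indexed $0,\ldots,M/2-1$, and uses arbitrary sizes $M = 32$, $N = 64$; it builds $\Phi$ and evaluates both localized energies against the bound $4N/M^2$.

```python
import numpy as np

def fourier_frame(M, N):
    """Stack real and imaginary parts of the first M/2 DFT rows; columns get unit norm."""
    rows = np.arange(M // 2)[:, None]
    cols = np.arange(N)[None, :]
    F = np.exp(2j * np.pi * rows * cols / N)
    return np.sqrt(2.0 / M) * np.vstack([F.real, F.imag])

M, N = 32, 64
Phi = fourier_frame(M, N)
col_norms = np.linalg.norm(Phi, axis=0)

S = np.arange(N // 4, 3 * N // 4)          # indices far from 0 (mod N)
Sc = np.setdiff1d(np.arange(N), S)         # indices far from N/2
u, v = Phi[:, 0], Phi[:, N // 2]

energy_u = np.linalg.norm(Phi[:, S].T @ u) ** 2     # ||Phi_S^* u||^2
energy_v = np.linalg.norm(Phi[:, Sc].T @ v) ** 2    # ||Phi_{S^c}^* v||^2
bound = 4.0 * N / M**2
```

With these sizes the bound equals 0.25, so both $\lambda_{\min}(\Phi_S\Phi_S^*)$ and $\lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*)$ are at most 0.25 even though every column of $\Phi$ has unit norm.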

We can also apply the strong complement property to show that certain (random) ensembles produce stable measurements. We will use the following lemma, which is proved in the proof of Lemma 4.1 in [35]:

Lemma 4.6. Given $n \geq m \geq 2$, draw a real $m \times n$ matrix $G$ of independent standard normal entries. Then

$$\Pr\left(\lambda_{\min}(GG^*) \leq \frac{n}{t^2}\right) \leq \frac{1}{\Gamma(n - m + 2)}\left(\frac{n}{t}\right)^{n - m + 1} \qquad \forall t > 0.$$

Theorem 4.7. Draw an $M \times N$ matrix $\Phi$ with independent standard normal entries, and denote $R = \frac{N}{M}$. Provided $R > 2$, then for every $\varepsilon > 0$, $\Phi$ has the $\sigma$-strong complement property with

$$\sigma = \frac{1}{\sqrt{2}\,e^{1 + \varepsilon/(R-2)}}\cdot\frac{N - 2M + 2}{2^{R/(R-2)}\sqrt{N}},$$

with probability greater than or equal to $1 - e^{-\varepsilon M}$.

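Proof. Fix $M$ and $N$, and consider the function $f\colon (M - 2, \infty) \to (0, \infty)$ defined by

$$f(x) := \frac{1}{\Gamma(x - M + 2)}\big(\sigma\sqrt{x}\big)^{x - M + 1}.$$

To simplify the analysis, we will assume that $N$ is even, but the proof can be amended to account for the odd case. Applying Lemma 4.6, we have for every subset $S \subseteq \{1,\ldots,N\}$ of size $K$ that $\Pr(\lambda_{\min}(\Phi_S\Phi_S^*) < \sigma^2) \leq f(K)$, provided $K \geq M$, and similarly $\Pr(\lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*) < \sigma^2) \leq f(N - K)$, provided $N - K \geq M$. We will use this to bound the probability that $\Phi$ is not $\sigma$-SCP. Since $\lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*) = 0$ whenever $|S| \geq N - M + 1$ and $\lambda_{\min}(\Phi_S\Phi_S^*) \leq \lambda_{\min}(\Phi_T\Phi_T^*)$ whenever $S \subseteq T$, a union bound gives

$$\begin{aligned} \Pr\big(\Phi \text{ is not } \sigma\text{-SCP}\big) &= \Pr\big(\exists S \subseteq \{1,\ldots,N\} \text{ s.t. } \lambda_{\min}(\Phi_S\Phi_S^*) < \sigma^2 \text{ and } \lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*) < \sigma^2\big) \\ &\leq \Pr\big(\exists S \subseteq \{1,\ldots,N\},\ |S| = N - M + 1, \text{ s.t. } \lambda_{\min}(\Phi_S\Phi_S^*) < \sigma^2\big) \\ &\quad + \Pr\big(\exists S \subseteq \{1,\ldots,N\},\ M \leq |S| \leq N - M, \text{ s.t. } \lambda_{\min}(\Phi_S\Phi_S^*) < \sigma^2 \text{ and } \lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*) < \sigma^2\big) \\ &\leq \binom{N}{N - M + 1} f(N - M + 1) + \frac{1}{2}\sum_{K=M}^{N-M}\binom{N}{K} f(K) f(N - K), \end{aligned} \tag{30}$$

where the last inequality follows in part from the fact that $\lambda_{\min}(\Phi_S\Phi_S^*)$ and $\lambda_{\min}(\Phi_{S^c}\Phi_{S^c}^*)$ are independent random variables, and the factor $\frac{1}{2}$ is an artifact of double counting partitions. We will further bound each term in (30) to get a simpler expression. First, $\binom{2k}{k} \geq 2^k$ for all $k$ and so

$$\begin{aligned} f(N - M + 1) &\leq \frac{1}{\Gamma(N - 2M + 3)}\big(\sigma\sqrt{N}\big)^{N - 2M + 2} \\ &\leq \frac{1}{\Gamma(N - 2M + 3)}\big(\sigma\sqrt{N}\big)^{N - 2M + 2}\cdot\frac{1}{2^{\frac{N}{2} - M + 1}}\binom{N - 2M + 2}{\frac{N}{2} - M + 1} = f\big(\tfrac{N}{2}\big)^2. \end{aligned}$$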
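Next, we will find that $g(x) := f(x)f(N - x)$ is maximized at $x = \frac{N}{2}$. To do this, we first find the critical points of $g$. Since $0 = g'(x) = f'(x)f(N - x) - f(x)f'(N - x)$, we have

$$\frac{d}{dy}\log f(y)\Big|_{y = x} = \frac{f'(x)}{f(x)} = \frac{f'(N - x)}{f(N - x)} = \frac{d}{dy}\log f(y)\Big|_{y = N - x}. \tag{31}$$

To analyze this further, we take another derivative:

$$\frac{d^2}{dy^2}\log f(y) = \frac{1}{2y} + \frac{M - 1}{2y^2} - \frac{d^2}{dy^2}\log\Gamma(y - M + 2). \tag{32}$$

It is straightforward to see that

$$\frac{1}{2y} + \frac{M - 1}{2y^2} \leq \frac{1}{y - M + 2} = \int_{y - M + 2}^{\infty}\frac{dt}{t^2} < \sum_{k=0}^{\infty}\frac{1}{(y - M + 2 + k)^2} = \frac{d^2}{dy^2}\log\Gamma(y - M + 2),$$

where the last step uses a series expression for the trigamma function $\psi_1(z) := \frac{d^2}{dz^2}\log\Gamma(z)$; see Section 6.4 of [1]. Applying this to (32) then gives that $\frac{d^2}{dy^2}\log f(y) < 0$, which in turn implies that $\frac{d}{dy}\log f(y)$ is strictly decreasing in $y$. Thus, (31) requires $x = N - x$, and so $x = \frac{N}{2}$ is the only critical point of $g$. Furthermore, to see that this is a maximizer, notice that

$$g''\big(\tfrac{N}{2}\big) = 2f\big(\tfrac{N}{2}\big)f''\big(\tfrac{N}{2}\big) - 2f'\big(\tfrac{N}{2}\big)^2 = 2f\big(\tfrac{N}{2}\big)^2\,\frac{d}{dy}\frac{f'(y)}{f(y)}\Big|_{y = \frac{N}{2}} = 2f\big(\tfrac{N}{2}\big)^2\,\frac{d^2}{dy^2}\log f(y)\Big|_{y = \frac{N}{2}} < 0.$$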

To summarize, we have that $f(N - M + 1)$ and $f(K)f(N - K)$ are both at most $f(\frac{N}{2})^2$. This leads to the following bound on (30):

$$\Pr\big(\Phi \text{ is not } \sigma\text{-SCP}\big) \leq \frac{1}{2}\sum_{K=0}^{N}\binom{N}{K} f\big(\tfrac{N}{2}\big)^2 = 2^{N-1} f\big(\tfrac{N}{2}\big)^2 = \frac{2^{N-1}}{\Gamma(\frac{N}{2} - M + 2)^2}\left(\sigma\sqrt{\frac{N}{2}}\right)^{N - 2M + 2}.$$

Finally, applying the fact that $\Gamma(k + 1) \geq e\big(\frac{k}{e}\big)^k$ gives

$$\begin{aligned} \Pr\big(\Phi \text{ is not } \sigma\text{-SCP}\big) &\leq \frac{2^{N-1}}{e^2}\left(\sigma e\sqrt{2}\cdot\frac{\sqrt{N}}{N - 2M + 2}\right)^{N - 2M + 2} \\ &= \frac{2^{RM}}{2e^2}\Big(e^{-\varepsilon/(R-2)}\,2^{-R/(R-2)}\Big)^{(R-2)M + 2} \\ &\leq 2^{RM}\big(e^{\varepsilon}2^{R}\big)^{-M} = e^{-\varepsilon M}, \end{aligned}$$

as claimed. □

Figure 3: The graph on the left depicts $\log_{10} b(R)$ as a function of $R$, which is defined in (33). Modulo $\varepsilon$ terms, this serves as an upper bound on $\log_{10}(2\|\Phi^*\|_2/\sigma)$ with high probability as $M \to \infty$, where $\Phi$ is an $M \times RM$ matrix of independent standard Gaussian entries. Based on Theorem 4.2 (along with Lemma 4.3 and Theorem 4.5), this provides a stability guarantee for the corresponding measurement process, namely $\sqrt{\mathcal{A}}$. Since $\log_{10} b(R)$ exhibits an asymptote at $R = 2$, this gives no stability guarantee for measurement ensembles of redundancy 2. The next three graphs consider the special cases where $M = 2, 4, 6$, respectively. In each case, the dashed curve depicts the slightly stronger upper bound of $\log_{10} a(R, M)$, defined in (33). Also depicted, for each $R \in \{2, 2.5, 3, 3.5, 4\}$, are 30 realizations of $\log_{10}(2\|\Phi^*\|_2/\sigma)$; we provide a piecewise linear graph connecting the sample averages for clarity. Notice that as $M$ increases, $\log_{10} a(R, M)$ approaches $\log_{10} b(R)$; this is easily seen by their definitions in (33). More interestingly, the random realizations also appear to be approaching $\log_{10} b(R)$; this is most notable with the realizations corresponding to $R = 2$. To be clear, we use $\sigma$ as a proxy for $\alpha$ (see Theorem 4.5) because $\alpha$ is particularly difficult to obtain; as such, we do not plot realizations of $\log_{10}(2\beta/\alpha)$.

Considering $\|\Phi^*\|_2 \le (1+\varepsilon)(\sqrt{N}+\sqrt{M})$ with probability $\ge 1 - 2e^{-\varepsilon(\sqrt{N}+\sqrt{M})^2/2}$
(see Theorem 11.13 of [42]), we can leverage Theorem 4.7 to determine the stability
of a Gaussian measurement ensemble. Specifically, by Theorem 4.2 (along with
Lemma 4.3 and Theorem 4.5) we have that such measurements are $C$-stable with


a(R,M) 


Figure 3 illustrates these bounds along with different realizations of $2\|\Phi^*\|_2/\sigma$. This
suggests that the redundancy of the measurement process is the main factor that
determines stability of a random measurement ensemble (and that bounded
redundancies suffice for stability). Furthermore, the project-and-invert estimator will yield
particularly  stable  signal  reconstruction,  although  it  is  not  obvious  how  to  efficiently 
implement  this  estimator;  this  is  one  advantage  given  by  the  reconstruction  algo¬ 
rithms  in  [3,27]. 

4.2 Stability in the average case

Suppose a random variable $Y$ is drawn according to some unknown member
of a parameterized family of probability density functions $\{f(\cdot;\theta)\}_{\theta\in\Omega}$. The Fisher
information $J(\theta)$ quantifies how much information about the unknown parameter $\theta$
is  given  by  the  random  variable  on  average.  This  is  particularly  useful  in  statistical 
signal  processing,  where  a  signal  measurement  is  corrupted  by  random  noise,  and 
the  original  signal  is  viewed  as  a  parameter  of  the  random  measurement’s  unknown 
probability  density  function;  as  such,  the  Fisher  information  quantifies  how  useful 
the  noisy  measurement  is  for  signal  estimation. 

In this section, we will apply the theory of Fisher information to evaluate
the stability of the intensity measurement mapping $\mathcal{A}$. To do this, we consider a
stochastic noise model, that is, given some signal $x$, we take measurements of the
form $Y = \mathcal{A}(x) + Z$, where the entries of $Z$ are independent Gaussian random
variables with mean 0 and variance $\sigma^2$. We want to use $Y$ to estimate $x$ up to a
global phase factor; to simplify the analysis, we will estimate a particular representative
$\theta(x)$ of $x$ modulo global phase, specifically (and arbitrarily) $x$ divided by the phase
of its last nonzero entry. As such, $Y$ is a random vector with probability density function

\[ f(y;\theta) = \frac{1}{(2\pi\sigma^2)^{N/2}}\,e^{-\|y - \mathcal{A}(\theta)\|^2/(2\sigma^2)} \qquad \forall y \in \mathbb{R}^N. \tag{34} \]

To  be  clear,  many  of  the  results  that  follow  are  consequences  of  the  fact  that  (34)  is 
a  member  of  the  exponential  family  of  distributions;  we  go  through  the  analysis  here 
since  the  relevant  literature  may  be  less  familiar  to  the  phase  retrieval  community. 
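With the probability density function (34), we can calculate the Fisher information
matrix, defined entrywise by

\[ (J(\theta))_{ij} := \mathbb{E}\bigg[\Big(\frac{\partial}{\partial\theta_i}\log f(Y;\theta)\Big)\Big(\frac{\partial}{\partial\theta_j}\log f(Y;\theta)\Big)\bigg]. \tag{35} \]

In particular, we have

\[ \frac{\partial}{\partial\theta_i}\log f(y;\theta) = \frac{\partial}{\partial\theta_i}\bigg(-\frac{1}{2\sigma^2}\sum_{n=1}^{N}\big(y_n - (\mathcal{A}(\theta))_n\big)^2\bigg) = \frac{1}{\sigma^2}\sum_{n=1}^{N}\big(y_n - (\mathcal{A}(\theta))_n\big)\frac{\partial}{\partial\theta_i}(\mathcal{A}(\theta))_n, \]

and so applying (35) along with the independence of the entries of $Z$ gives

\begin{align*}
(J(\theta))_{ij} &= \frac{1}{\sigma^4}\sum_{n=1}^{N}\sum_{n'=1}^{N}\Big(\frac{\partial}{\partial\theta_i}(\mathcal{A}(\theta))_n\Big)\Big(\frac{\partial}{\partial\theta_j}(\mathcal{A}(\theta))_{n'}\Big)\mathbb{E}[Z_n Z_{n'}] \\
&= \frac{1}{\sigma^2}\sum_{n=1}^{N}\Big(\frac{\partial}{\partial\theta_i}(\mathcal{A}(\theta))_n\Big)\Big(\frac{\partial}{\partial\theta_j}(\mathcal{A}(\theta))_n\Big).
\end{align*}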

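It remains to take partial derivatives of $\mathcal{A}(\theta)$, but this calculation depends on whether
$\theta$ is real or complex. In the real case, we have

\[ \frac{\partial}{\partial\theta_i}(\mathcal{A}(\theta))_n = \frac{\partial}{\partial\theta_i}\bigg(\sum_{m=1}^{M}\theta_m\varphi_n(m)\bigg)^2 = 2\bigg(\sum_{m=1}^{M}\theta_m\varphi_n(m)\bigg)\varphi_n(i) = 2\langle\theta,\varphi_n\rangle\varphi_n(i). \]

Thus, if we take $\Psi(\theta)$ to be the $M \times N$ matrix whose $n$th column is $\langle\theta,\varphi_n\rangle\varphi_n$, then the
Fisher information matrix can be expressed as $J(\theta) = \frac{4}{\sigma^2}\Psi(\theta)\Psi(\theta)^*$. Interestingly,
Theorem 2.2 implies that $J(\theta)$ is necessarily positive definite when $\mathcal{A}$ is injective.
To see this, suppose there exists $\theta \in \Omega$ such that $J(\theta)$ has a nontrivial null space.
Then $\{\langle\theta,\varphi_n\rangle\varphi_n\}_{n=1}^{N}$ does not span $\mathbb{R}^M$, and so $S = \{n : \langle\theta,\varphi_n\rangle = 0\}$ breaks
the complement property. As the following result shows, when $\mathcal{A}$ is injective, the
conditioning of $J(\theta)$ lends some insight into stability: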
Theorem 4.8. For $x \in \mathbb{R}^M$, let $Y = \mathcal{A}(x) + Z$ denote noisy intensity measurements
with $Z$ having independent $\mathcal{N}(0, \sigma^2)$ entries. Furthermore, define the parameter $\theta$
to be $x$ divided by the sign of its last nonzero entry; let $\Omega \subseteq \mathbb{R}^M$ denote all such $\theta$.
Then for any unbiased estimator $\hat\theta(Y)$ of $\theta$ in $\Omega$ with a finite $M \times M$ covariance
matrix $C(\hat\theta)$, we have $C(\hat\theta) - J(\theta)^{-1}$ is positive semidefinite whenever $\theta \in \operatorname{int}(\Omega)$.

This result was first given by Balan (see Theorem 4.1 in [5]). Note that the
requirement that $\theta$ be in the interior of $\Omega$ can be weakened to $\theta \neq 0$ by recognizing
that our choice for $\theta$ (dividing by the sign of the last nonzero entry) was arbitrary.
To interpret this theorem, note that

\begin{align*}
\operatorname{Tr}[C(\hat\theta)] &= \operatorname{Tr}\big[\mathbb{E}[(\hat\theta(Y) - \theta)(\hat\theta(Y) - \theta)^T]\big] \\
&= \mathbb{E}\big[\operatorname{Tr}[(\hat\theta(Y) - \theta)(\hat\theta(Y) - \theta)^T]\big] \\
&= \mathbb{E}\big[\operatorname{Tr}[(\hat\theta(Y) - \theta)^T(\hat\theta(Y) - \theta)]\big] = \mathbb{E}\|\hat\theta(Y) - \theta\|^2,
\end{align*}

and so Theorem 4.8 and the linearity of the trace together give $\mathbb{E}\|\hat\theta(Y) - \theta\|^2 =
\operatorname{Tr}[C(\hat\theta)] \ge \operatorname{Tr}[J(\theta)^{-1}]$. In the previous section, Definition 4.1 provided a notion
of worst-case stability based on the existence of an estimator with small error. By
analogy, Theorem 4.8 demonstrates a converse of sorts: that no unbiased estimator
will have mean squared error smaller than $\operatorname{Tr}[J(\theta)^{-1}]$. As such, a stable measurement
ensemble might minimize $\sup_{\theta\in\Omega}\operatorname{Tr}[J(\theta)^{-1}]$, although this is a particularly
cumbersome objective function to work with. More interestingly, Theorem 4.8 provides
another numerical strengthening of the complement property (analogous to the
$\sigma$-strong complement property of the previous section). Unfortunately, we cannot
make a more rigorous comparison between the worst- and average-case analyses of
stability; indeed, our worst-case analysis exploited the fact that $\sqrt{\mathcal{A}}$ is bilipschitz
(which $\mathcal{A}$ is not), and as we shall see, the average-case analysis depends on $\mathcal{A}$ being
differentiable (which $\sqrt{\mathcal{A}}$ is not).
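
The real-case quantities above are easy to evaluate numerically. The following is a minimal sketch, not taken from the thesis: it assumes a random Gaussian ensemble, and the names `fisher_information_real`, `Phi`, `theta`, and `crlb` are illustrative. It forms $J(\theta) = \frac{4}{\sigma^2}\Psi(\theta)\Psi(\theta)^*$ and the lower bound $\operatorname{Tr}[J(\theta)^{-1}]$ on the mean squared error of any unbiased estimator.

```python
import numpy as np

def fisher_information_real(Phi, theta, sigma):
    """Return J(theta) = (4 / sigma^2) Psi Psi^T for real measurement vectors.

    Phi is M x N with columns phi_n; column n of Psi is <theta, phi_n> phi_n.
    """
    inner = Phi.T @ theta      # <theta, phi_n> for every n, shape (N,)
    Psi = Phi * inner          # scales column n of Phi by <theta, phi_n>
    return (4.0 / sigma**2) * (Psi @ Psi.T)

# Hypothetical test ensemble: N = 8 Gaussian vectors in R^3.
rng = np.random.default_rng(0)
M, N, sigma = 3, 8, 0.1
Phi = rng.standard_normal((M, N))
theta = rng.standard_normal(M)
theta *= np.sign(theta[-1])    # normalise so the last entry is positive

J = fisher_information_real(Phi, theta, sigma)
crlb = np.trace(np.linalg.inv(J))   # Tr[J(theta)^{-1}], the bound in the text
```

For a generic ensemble of this size the matrix `J` is positive definite, so `crlb` is finite; an ensemble that breaks the complement property would instead give a singular `J` for some `theta`.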
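To calculate the information matrix in the complex case, we first express our
parameter vector in real coordinates: $\theta = (\theta_1 + \mathrm{i}\theta_{M+1}, \theta_2 + \mathrm{i}\theta_{M+2}, \ldots, \theta_M + \mathrm{i}\theta_{2M})$,
that is, we view $\theta$ as a $2M$-dimensional real vector by concatenating its real and
imaginary parts. Next, for any arbitrary function $g\colon \mathbb{R}^{2M} \to \mathbb{C}$, the product rule
gives

\[ \frac{\partial}{\partial\theta_i}|g(\theta)|^2 = \frac{\partial}{\partial\theta_i}\,g(\theta)\overline{g(\theta)} = \Big(\frac{\partial}{\partial\theta_i}g(\theta)\Big)\overline{g(\theta)} + g(\theta)\,\overline{\frac{\partial}{\partial\theta_i}g(\theta)} = 2\operatorname{Re}\Big(g(\theta)\,\overline{\frac{\partial}{\partial\theta_i}g(\theta)}\Big). \tag{36} \]

Since we care about partial derivatives of $\mathcal{A}(\theta)$, we take

\[ g(\theta) = \langle\theta,\varphi_n\rangle = \sum_{m=1}^{M}(\theta_m + \mathrm{i}\theta_{M+m})\overline{\varphi_n(m)}, \]

and so

\[ \overline{\frac{\partial}{\partial\theta_i}g(\theta)} = \begin{cases} \varphi_n(i) & \text{if } i \le M \\ -\mathrm{i}\,\varphi_n(i - M) & \text{if } i > M. \end{cases} \tag{37} \]

Combining (36) and (37) then gives the following expression for the Fisher information
matrix: Take $\Psi(\theta)$ to be the $2M \times N$ matrix whose $n$th column is formed by
stacking the real and imaginary parts of $\langle\theta,\varphi_n\rangle\varphi_n$; then $J(\theta) = \frac{4}{\sigma^2}\Psi(\theta)\Psi(\theta)^*$.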
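
The complex-case construction can be checked numerically. The sketch below is illustrative only: it assumes a random complex Gaussian ensemble and the convention $\langle\theta,\varphi_n\rangle = \varphi_n^*\theta$, and the names `psi_complex`, `i_theta`, and `J_tilde` are not from the thesis. It builds the $2M \times N$ matrix $\Psi(\theta)$, confirms that the real representation of $\mathrm{i}\theta$ lies in the null space of $J(\theta)$ (the global phase direction), and extracts the submatrix obtained by deleting the last row and column.

```python
import numpy as np

def psi_complex(Phi, theta):
    """Stack real and imaginary parts of <theta, phi_n> phi_n as column n."""
    inner = Phi.conj().T @ theta        # <theta, phi_n> = phi_n^* theta
    cols = Phi * inner                  # column n is <theta, phi_n> phi_n
    return np.vstack([cols.real, cols.imag])   # shape (2M, N)

# Hypothetical test ensemble: N = 12 complex Gaussian vectors in C^3.
rng = np.random.default_rng(1)
M, N, sigma = 3, 12, 0.1
Phi = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
theta = rng.standard_normal(M) + 1j * rng.standard_normal(M)
theta = theta * np.exp(-1j * np.angle(theta[-1]))   # last entry real and positive

Psi = psi_complex(Phi, theta)
J = (4.0 / sigma**2) * (Psi @ Psi.T)

# Real coordinates of i*theta: a direction along which the intensities do not change.
i_theta = np.concatenate([(1j * theta).real, (1j * theta).imag])
residual = np.linalg.norm(J @ i_theta) / (np.linalg.norm(J) * np.linalg.norm(i_theta))

# Deleting the last row and column removes the coordinate pinned to zero by the
# phase normalisation (the imaginary part of the last entry of theta).
J_tilde = J[:-1, :-1]
min_eig = np.linalg.eigvalsh(J_tilde).min()
```

Here `residual` is zero up to rounding, so `J` itself is singular, while `min_eig` is positive for a generic ensemble of this size.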


Lemma 4.9. Take $\tilde{J}(\theta)$ to be the $(2M - 1) \times (2M - 1)$ matrix that comes from
removing the last row and column of $J(\theta)$. If $\mathcal{A}$ is injective, then $\tilde{J}(\theta)$ is positive
definite for every $\theta \in \operatorname{int}(\Omega)$.


Proof. First, we note that $J(\theta) = \frac{4}{\sigma^2}\Psi(\theta)\Psi(\theta)^*$ is necessarily positive semidefinite,
and so

\[ \inf_{\|x\|=1} x^T\tilde{J}(\theta)x = \inf_{\|x\|=1}[x; 0]^T J(\theta)[x; 0] \ge \inf_{\|y\|=1} y^T J(\theta)y \ge 0. \]

As such, it suffices to show that $\tilde{J}(\theta)$ is invertible.

To this end, take any vector $x$ in the null space of $\tilde{J}(\theta)$. Then defining $y :=
[x; 0] \in \mathbb{R}^{2M}$, we have that $J(\theta)y$ is zero in all but (possibly) the $2M$th entry. As
such, $0 = \langle y, J(\theta)y\rangle = \frac{4}{\sigma^2}\|\Psi(\theta)^*y\|^2$, meaning $y$ is orthogonal to the columns of $\Psi(\theta)$.
Since $\mathcal{A}$ is injective, Theorem 2.3 then gives that $y = \alpha\mathrm{i}\theta$ for some $\alpha \in \mathbb{R}$. But since
$\theta \in \operatorname{int}(\Omega)$, we have $\theta_M > 0$, and so the $2M$th entry of $\mathrm{i}\theta$ is necessarily nonzero. This
means $\alpha = 0$, and so $y$ (and thus $x$) is trivial. □

Theorem 4.10. For $x \in \mathbb{C}^M$, let $Y = \mathcal{A}(x) + Z$ denote noisy intensity measurements
with $Z$ having independent $\mathcal{N}(0, \sigma^2)$ entries. Furthermore, define the parameter $\theta$ to
be $x$ divided by the phase of its last nonzero entry, and view $\theta$ as a vector in $\mathbb{R}^{2M}$ by
concatenating its real and imaginary parts; let $\Omega \subseteq \mathbb{R}^{2M}$ denote all such $\theta$. Then for
any unbiased estimator $\hat\theta(Y)$ of $\theta$ in $\Omega$ with a finite $2M \times 2M$ covariance matrix $C(\hat\theta)$,
the last row and column of $C(\hat\theta)$ are both zero, and the remaining $(2M - 1) \times (2M - 1)$
submatrix $\tilde{C}(\hat\theta)$ has the property that $\tilde{C}(\hat\theta) - \tilde{J}(\theta)^{-1}$ is positive semidefinite whenever
$\theta \in \operatorname{int}(\Omega)$.


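Proof. We start by following the usual proof of the vector parameter Cramér-Rao
lower bound (see for example Appendix 3B of [69]). Note that for any $i, j \in
\{1, \ldots, 2M\}$,

\begin{align*}
\int_{\mathbb{R}^N}\big((\hat\theta(y))_j - \theta_j\big)\frac{\partial\log f(y;\theta)}{\partial\theta_i}f(y;\theta)\,dy
&= \int_{\mathbb{R}^N}(\hat\theta(y))_j\frac{\partial f(y;\theta)}{\partial\theta_i}\,dy - \theta_j\int_{\mathbb{R}^N}\frac{\partial f(y;\theta)}{\partial\theta_i}\,dy \\
&= \frac{\partial}{\partial\theta_i}\int_{\mathbb{R}^N}(\hat\theta(y))_j f(y;\theta)\,dy - \theta_j\frac{\partial}{\partial\theta_i}\int_{\mathbb{R}^N}f(y;\theta)\,dy,
\end{align*}

where the second equality is by differentiation under the integral sign (see Lemma A.1
in Appendix A for details; here, we use the fact that $\hat\theta$ has a finite covariance matrix
so that $\hat\theta_j$ has a finite second moment). Next, we use the facts that $\hat\theta$ is unbiased
and $f(\cdot;\theta)$ is a probability density function (regardless of $\theta$) to get

\[ \int_{\mathbb{R}^N}\big((\hat\theta(y))_j - \theta_j\big)\frac{\partial\log f(y;\theta)}{\partial\theta_i}f(y;\theta)\,dy = \frac{\partial\theta_j}{\partial\theta_i} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \]

Thus, letting $\nabla_\theta\log f(y;\theta)$ denote the column vector whose $i$th entry is $\frac{\partial\log f(y;\theta)}{\partial\theta_i}$,
we have

\[ I = \int_{\mathbb{R}^N}(\hat\theta(y) - \theta)(\nabla_\theta\log f(y;\theta))^T f(y;\theta)\,dy. \]

Equivalently, we have that for all column vectors $a, b \in \mathbb{R}^{2M}$,

\[ a^T b = \int_{\mathbb{R}^N} a^T(\hat\theta(y) - \theta)(\nabla_\theta\log f(y;\theta))^T b\, f(y;\theta)\,dy. \]

Next, we apply the Cauchy-Schwarz inequality in $f$-weighted $L^2$ space to get

\begin{align*}
(a^T b)^2 &= \bigg(\int_{\mathbb{R}^N} a^T(\hat\theta(y) - \theta)(\nabla_\theta\log f(y;\theta))^T b\, f(y;\theta)\,dy\bigg)^2 \\
&\le \int_{\mathbb{R}^N} a^T(\hat\theta(y) - \theta)(\hat\theta(y) - \theta)^T a\, f(y;\theta)\,dy \\
&\qquad\times \int_{\mathbb{R}^N} b^T(\nabla_\theta\log f(y;\theta))(\nabla_\theta\log f(y;\theta))^T b\, f(y;\theta)\,dy \\
&= \big(a^T C(\hat\theta)a\big)\big(b^T J(\theta)b\big),
\end{align*}

where the last step follows from pulling vectors out of integrals. Taking $b :=
[\tilde{J}(\theta)^{-1}\tilde{a}; 0]$, where $\tilde{a}$ is the first $2M - 1$ entries of $a$, this then implies

\[ \big(\tilde{a}^T\tilde{J}(\theta)^{-1}\tilde{a}\big)^2 = (a^T b)^2 \le \big(a^T C(\hat\theta)a\big)\big(b^T J(\theta)b\big) = \big(a^T C(\hat\theta)a\big)\big(\tilde{a}^T\tilde{J}(\theta)^{-1}\tilde{a}\big). \tag{38} \]

At this point, we note that since the last (complex) entry of $\theta \in \Omega$ is necessarily
positive, then as a $2M$-dimensional real vector, the last entry is necessarily zero,
and furthermore every unbiased estimator $\hat\theta$ in $\Omega$ will also vanish in the last entry.
It follows that the last row and column of $C(\hat\theta)$ are both zero. Furthermore, since
$\tilde{J}(\theta)^{-1}$ is positive definite by Lemma 4.9, division in (38) gives

\[ \tilde{a}^T\tilde{J}(\theta)^{-1}\tilde{a} \le a^T C(\hat\theta)a = \tilde{a}^T\tilde{C}(\hat\theta)\tilde{a}, \]

from which the result follows. □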
V. The phase error problem in synthetic aperture radar

Now  that  we’ve  developed  an  intuition  for  phase  retrieval,  we  return  to  the  phase 
error  problem  in  synthetic  aperture  radar.  We  begin  by  formally  deriving  phase 
errors  in  the  bistatic  setting,  at  which  point  we  relate  the  problem  to  a  certain 
interferometric  approach  to  phase  retrieval  [3].  This  motivates  the  use  of  graphs 
to  organize  the  given  SAR  data,  and  then  we  can  leverage  an  algorithm  known  as 
angular  synchronization  to  recover  the  phase  errors.  The  remainder  of  the  chapter  is 
dedicated  to  developing  this  algorithmic  approach  to  solving  the  phase  error  problem, 
and  we  conclude  with  simulations  that  illustrate  its  stability  to  noise. 

5.1  Synthetic  aperture  radar 

As  discussed  in  Chapter  I,  SAR  is  a  form  of  microwave  radar  that  uses  relative 
motion  between  a  source  and  scene  to  reconstruct  an  image  of  the  scene  with  finer 
resolution  than  possible  with  traditional  radar.  SAR  is  typically  implemented  in  a 
monostatic  setting,  usually  as  a  single  source  on  a  moving  platform  (e.g.,  an  aircraft 
or  satellite)  which  repeatedly  transmits  a  fixed  microwave  signal  to  the  target  scene 
and  records  the  resultant  reflected  signal.  As  a  result,  the  scene  is  imaged  at  various 
locations;  each  reflected  signal  provides  information  about  the  scene  from  a  different 
perspective,  allowing  for  finer  resolution  in  the  final  reconstruction.  In  this  section, 
we  will  consider  airborne  SAR,  where  the  radar  source  is  located  on  a  moving  aircraft. 
Since  the  speed  of  light  far  exceeds  that  of  the  aircraft,  the  transmitted  and  reflected 
signals  will  effectively  travel  to  and  from  the  aircraft  along  the  same  path. 

To see how airborne SAR works, consider a two-dimensional scene

\[ D := \{x = (x_1, x_2)^T \in \mathbb{R}^2 : x_1^2 + x_2^2 \le \beta^2\}, \]

with magnetic reflectivity described by the function $\rho\colon \mathbb{R}^2 \to \mathbb{R}$; the reflectivity is
taken to be zero outside of the region $D$ for simplicity. The assumption that the
scene is two-dimensional is possible only if elevations within the scene are relatively
constant in comparison to the radius $\beta$; here, we operate under this assumption.
Taking the position of the aircraft to be $r \in \mathbb{R}^2$, where $\|r\| \gg \beta$, the Pythagorean
Theorem yields

\[ \|x - r\|^2 = \Big|\Big\langle x - r, \frac{r}{\|r\|}\Big\rangle\Big|^2 + \Big|\Big\langle x - r, \frac{Ar}{\|r\|}\Big\rangle\Big|^2 \]

for every vector $x \in D$, where $A \in \mathbb{R}^{2\times 2}$ is the rotation matrix

\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \]

Since $r$ and $Ar$ are orthogonal, we then obtain

\[ \Big|\Big\langle x - r, \frac{Ar}{\|r\|}\Big\rangle\Big| = \Big|\Big\langle x, \frac{Ar}{\|r\|}\Big\rangle - \Big\langle r, \frac{Ar}{\|r\|}\Big\rangle\Big| = \Big|\Big\langle x, \frac{Ar}{\|r\|}\Big\rangle\Big| \le \|x\| \le \beta \]

by an application of the Cauchy-Schwarz inequality, and so we have

\[ \|x - r\|^2 \le \Big|\Big\langle x - r, \frac{r}{\|r\|}\Big\rangle\Big|^2 + \beta^2. \]

Noting that the radius $\beta$ is quite small relative to the distance $\|r\|$, this allows
the approximation $\|x - r\| \approx |\langle x - r, z\rangle|$, where $z := -r/\|r\|$ is the bearing vector
between the aircraft and the scene; that is, $z$ is the unit vector that determines the
direction from the aircraft to the center of the scene. Essentially, this approximation
makes use of the assumption $\|r\| \gg \beta$ to conclude that arcs of constant distance
from the aircraft intersect the scene as straight lines perpendicular to its bearing
vector $z$; such lines are referred to as lines of constant range, and so each distance
in the direction of the bearing vector defines a unique line of constant range. In
most applications of airborne SAR, this approximation is nearly sharp [48], and so
we consider $\|x - r\| \approx |\langle x - r, z\rangle|$ for every $x \in D$.
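
To get a feel for how tight this far-field approximation is, the following sketch compares the exact distance with its projection onto the bearing vector. The scene radius and aircraft position are hypothetical values chosen for illustration. From the inequality above, the gap satisfies $0 \le \|x - r\| - |\langle x - r, z\rangle| \le \beta^2/(2(\|r\| - \beta))$, which the code evaluates as `bound`.

```python
import numpy as np

def range_approximation_error(x, r):
    """Exact distance ||x - r|| minus the projected distance |<x - r, z>|."""
    z = -r / np.linalg.norm(r)           # bearing vector from aircraft to scene centre
    exact = np.linalg.norm(x - r)
    approx = abs(np.dot(x - r, z))
    return exact - approx

# Hypothetical geometry: a scene of radius 1 km viewed from 500 km away.
beta = 1.0e3
r = np.array([3.0e5, 4.0e5])

# Sample points inside the disc D of radius beta.
rng = np.random.default_rng(2)
angles = rng.uniform(0.0, 2.0 * np.pi, 1000)
radii = beta * np.sqrt(rng.uniform(0.0, 1.0, 1000))
points = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

errors = np.array([range_approximation_error(x, r) for x in points])
bound = beta**2 / (2.0 * (np.linalg.norm(r) - beta))   # about one metre here
```

With these numbers the worst-case discrepancy is on the order of a metre against a range of 500 km, which illustrates why lines of constant range are treated as straight.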
Suppose the aircraft transmits a signal $f$, which is then reflected by the scene
and returned to the aircraft as the signal $g$. Making the reasonable assumption that
the speed of light far exceeds that of the aircraft, we know that $f$ and $g$ will travel
the same path to and from any point $x \in D$; hence, both signals will travel the same
distance, namely $\|x - r\|$. As a result, we expect the received signal to be given by

\[ g(t) = \iint_D f\Big(t - \frac{2}{c}\|x - r\|\Big)\rho(x)\,dx_1dx_2 \approx \iint_D f\Big(t - \frac{2}{c}\langle x - r, z\rangle\Big)\rho(x)\,dx_1dx_2. \]

More generally, if the signal $f$ is transmitted at a position $r_1$, then the received signal
at any point $r_2$ is

\[ g(t) \approx \iint_D f\Big(t - \frac{1}{c}\big(\langle x - r_1, z_1\rangle + \langle x - r_2, z_2\rangle\big)\Big)\rho(x)\,dx_1dx_2, \tag{39} \]

where $z_1 := -r_1/\|r_1\|$, $z_2 := -r_2/\|r_2\|$, and it is assumed that $\|r_1\| \gg \beta$ and
$\|r_2\| \gg \beta$, i.e., the transmitter and receiver are both sufficiently far from the scene.

For a fixed transmitter and receiver, it is unclear how to distinguish two points
$x, y \in D$, $x \neq y$, given the signal $g$ recorded over a period of time. The problem,
which we discuss in the monostatic setting for simplicity, is two-fold: First, points
along any line of constant range are necessarily indiscernible, since their radar
signatures return to the source at exactly the same time. Hence, each recorded signal
only contains information about the integrated reflectivity function along each line
of constant range. The fix for this is the so-called synthetic aperture, which is the
effective distance the aircraft travels while repeatedly imaging the scene. As the
aircraft moves across the synthetic aperture, the lines of constant range rotate, and
so each recorded signal encodes the integrated reflectivity function along a different
set of lines through the scene [48]. Hence, wider synthetic apertures yield more
information about lines of constant range.

The  second  problem  is  distinguishing  points  of  differing  distances  from  the 
aircraft.  Although  the  radar  signatures  of  such  points  return  to  the  source  at  different 
times,  the  relative  size  of  the  scene  (and  the  magnitude  of  the  speed  of  light)  cause 
them  to  overlap  if  the  transmission  duration  is  too  small.  Indeed,  if  the  signal  is 
of  constant  frequency,  then  it  is  impossible  to  tell  two  such  points  apart,  regardless 
of  the  duration  of  the  transmission.  On  the  other  hand,  if  the  transmitted  signal 
is  a  superposition  of  a  range  of  frequencies  (i.e.,  a  burst),  then  it  is  possible  to 
distinguish  positions  within  the  scene  according  to  time  of  arrival,  provided  the 
duration  of  transmission  is  large  enough.  However,  transmitting  a  burst  is  expensive 
and  impractical  due  to  power  limitations. 

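To avoid this issue, a commonly used technique in airborne SAR is taking the
transmitted signal $f$ to be a linear chirp [48]. That is, let $f(\tau) := e^{2\pi\mathrm{i}(\frac{1}{2}v\tau^2 + w\tau)}$ with
instantaneous frequency

\[ \frac{f'(\tau)}{2\pi\mathrm{i}f(\tau)} = \frac{2\pi\mathrm{i}(v\tau + w)e^{2\pi\mathrm{i}(\frac{1}{2}v\tau^2 + w\tau)}}{2\pi\mathrm{i}\,e^{2\pi\mathrm{i}(\frac{1}{2}v\tau^2 + w\tau)}} = v\tau + w. \]
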

Here,  v  is  known  as  the  chirp  rate  and  w  the  base  frequency.  Notice  that  the  instanta¬ 
neous  frequency  of  a  chirp  is  linear  in  time;  this  enables  points  at  different  distances 
from  the  aircraft  to  be  distinguished  by  extending  the  transmission  duration  (essen¬ 
tially  amplifying  relative  distances)  with  a  low  power  requirement.  If  the  synthetic 
aperture  is  wide  enough  and  contains  enough  transmission  points,  the  resulting  col¬ 
lection  of  line  integrals  of  the  reflectivity  function  is  sufficient  to  distinguish  all  points 
within  the  scene  [48]. 

Under the appropriate assumptions on the chirp rate of the transmitted signal,
the reflected signal has a convenient form in terms of the two-dimensional Fourier
transform $F\colon L^1(\mathbb{R}^2) \to L^\infty(\mathbb{R}^2)$ defined by

\[ (Fh)(p, q) = \iint_{\mathbb{R}^2} h(x, y)\,e^{-2\pi\mathrm{i}\langle(p,q),(x,y)\rangle}\,dx\,dy \]

for every $h \in L^1(\mathbb{R}^2)$. One way in which this result is formulated is by viewing each
line integral of the reflectivity function as a one-dimensional Fourier transform and
applying the Projection-Slice Theorem (see Section 2.3.2 in [48]). Alternatively, one
could work entirely in two dimensions using linear operators. The general result,
proven by the latter approach, is given in the following:

Fact 5.1. Let $r_1, r_2 \in \mathbb{R}^2$, with $\|r_1\| \gg \beta$ and $\|r_2\| \gg \beta$, and $f$ be a chirp with chirp
rate $v$ and base frequency $w$. Furthermore, suppose that the signal $f$ is emitted at
position $r_1$ and the reflected signal $g$ is received at $r_2$, as in (39). If $v \ll \frac{c^2}{\beta(\|r_1\| + \|r_2\|)^2}$,
then

\[ g(t) \approx f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}(\|r_1\| + \|r_2\|)}\,(F\rho)\Big(\frac{1}{c}(vt + w)(z_1 + z_2)\Big). \]


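Before proving this, we require some definitions. Consider the modulation
and translation operators $E$ and $T$ defined respectively by $(Eh)(t) = e^{2\pi\mathrm{i}t}h(t)$ and
$(Th)(t) = h(t - 1)$ for any signal $h\colon \mathbb{R} \to \mathbb{C}$. These two operators act on a chirp
$f(\tau) = e^{2\pi\mathrm{i}(\frac{1}{2}v\tau^2 + w\tau)}$ in a particular way: For any $a \in \mathbb{R}$,

\begin{align*}
(T^a f)(\tau) &= f(\tau - a) = e^{2\pi\mathrm{i}(\frac{1}{2}v(\tau - a)^2 + w(\tau - a))} \\
&= e^{2\pi\mathrm{i}\left((\frac{1}{2}va^2 - wa) - av\tau + (\frac{1}{2}v\tau^2 + w\tau)\right)} \\
&= e^{2\pi\mathrm{i}(\frac{1}{2}va^2 - wa)}\,e^{2\pi\mathrm{i}(-av\tau)}f(\tau) \\
&= e^{2\pi\mathrm{i}(\frac{1}{2}va^2 - wa)}\,(E^{-av}f)(\tau).
\end{align*}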
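
The translation identity for chirps can be confirmed numerically. This is a small sketch with arbitrarily chosen parameter values; the helper names `chirp` and `translated_chirp_via_modulation` are illustrative. It evaluates both sides of $(T^a f)(\tau) = e^{2\pi\mathrm{i}(\frac{1}{2}va^2 - wa)}(E^{-av}f)(\tau)$ on a grid.

```python
import numpy as np

def chirp(tau, v, w):
    """Linear chirp with chirp rate v and base frequency w."""
    return np.exp(2j * np.pi * (0.5 * v * tau**2 + w * tau))

def translated_chirp_via_modulation(tau, a, v, w):
    """Right-hand side of the identity: constant phase times a modulated chirp."""
    constant_phase = np.exp(2j * np.pi * (0.5 * v * a**2 - w * a))
    modulation = np.exp(-2j * np.pi * a * v * tau)      # E^{-av} applied at tau
    return constant_phase * modulation * chirp(tau, v, w)

# Arbitrary illustrative parameters.
tau = np.linspace(-1.0, 1.0, 201)
v, w, a = 5.0, 20.0, 0.3

lhs = chirp(tau - a, v, w)                              # (T^a f)(tau)
rhs = translated_chirp_via_modulation(tau, a, v, w)
max_gap = np.max(np.abs(lhs - rhs))
```

The two arrays agree to rounding error, reflecting that delaying a chirp is the same as modulating it and multiplying by a constant phase.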


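Proof of Fact 5.1. Recall the form of the received signal (39). Defining the parameter
$t_x := \frac{1}{c}\big(\langle x - r_1, z_1\rangle + \langle x - r_2, z_2\rangle\big)$, we obtain

\begin{align*}
g(t) &= \iint_D f(t - t_x)\rho(x)\,dx_1dx_2 = \iint_D (T^{t_x}f)(t)\rho(x)\,dx_1dx_2 \\
&= \iint_D e^{2\pi\mathrm{i}(\frac{1}{2}vt_x^2 - wt_x)}\,(E^{-vt_x}f)(t)\rho(x)\,dx_1dx_2 \\
&= \iint_D e^{\pi\mathrm{i}vt_x^2}\,e^{-2\pi\mathrm{i}wt_x}\,e^{-2\pi\mathrm{i}(vt)t_x}f(t)\rho(x)\,dx_1dx_2.
\end{align*}

Now consider the signal

\[ \tilde{g}(t) := \iint_D e^{-2\pi\mathrm{i}wt_x}\,e^{-2\pi\mathrm{i}(vt)t_x}f(t)\rho(x)\,dx_1dx_2. \]

Then

\begin{align*}
|\tilde{g}(t) - g(t)| &= \bigg|\iint_D \big(1 - e^{\pi\mathrm{i}vt_x^2}\big)e^{-2\pi\mathrm{i}wt_x}\,e^{-2\pi\mathrm{i}(vt)t_x}f(t)\rho(x)\,dx_1dx_2\bigg| \\
&\le \iint_D \big|1 - e^{\pi\mathrm{i}vt_x^2}\big|\,|\rho(x)|\,dx_1dx_2 \\
&\le \bigg(\iint_D \big|1 - e^{\pi\mathrm{i}vt_x^2}\big|^2dx_1dx_2\bigg)^{1/2}\bigg(\iint_D |\rho(x)|^2dx_1dx_2\bigg)^{1/2},
\end{align*}

where the last inequality follows from Cauchy-Schwarz. By considering a Taylor
series, it can be seen that $|1 - e^{\mathrm{i}\theta}|^2 = 2(1 - \cos\theta) \le \theta^2$ for any $\theta \in \mathbb{R}$, and so it
follows that

\[ |\tilde{g}(t) - g(t)| \le \bigg(\iint_D \big(\pi vt_x^2\big)^2dx_1dx_2\bigg)^{1/2}\|\rho\|_{L^2}. \]

To bound the parameter $t_x$, first note that the Cauchy-Schwarz inequality yields

\begin{align*}
t_x &= \frac{1}{c}\big(\langle x - r_1, z_1\rangle + \langle x - r_2, z_2\rangle\big) \\
&= \frac{1}{c}\Big(\|r_1\| + \|r_2\| - \Big\langle x, \frac{r_1}{\|r_1\|}\Big\rangle - \Big\langle x, \frac{r_2}{\|r_2\|}\Big\rangle\Big) \\
&\le \frac{1}{c}\big(\|r_1\| + \|r_2\| + 2\|x\|\big),
\end{align*}

and so, over the region $D$, we obtain $0 < t_x < \frac{2}{c}(\|r_1\| + \|r_2\|)$. Thus,

\[ |\tilde{g}(t) - g(t)| \le \bigg(\Big(\pi v\,\frac{4}{c^2}(\|r_1\| + \|r_2\|)^2\Big)^2\pi\beta^2\bigg)^{1/2}\|\rho\|_{L^2} = \frac{4v\beta}{c^2}\cdot\pi^{3/2}\|\rho\|_{L^2}(\|r_1\| + \|r_2\|)^2. \]

Recalling the assumption $v \ll \frac{c^2}{\beta(\|r_1\| + \|r_2\|)^2}$, this bound implies that the signal $\tilde{g}$
approximates the received signal, and so we may write

\[ g(t) \approx \tilde{g}(t) = f(t)\iint_D e^{-2\pi\mathrm{i}(vt+w)t_x}\rho(x)\,dx_1dx_2. \]

Since $t_x = \frac{1}{c}\big(\|r_1\| + \|r_2\| + \langle x, z_1 + z_2\rangle\big)$, it follows that

\begin{align*}
g(t) &\approx f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}(\|r_1\| + \|r_2\|)}\iint_D e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}\langle x, z_1 + z_2\rangle}\rho(x)\,dx_1dx_2 \\
&= f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}(\|r_1\| + \|r_2\|)}\,(F\rho)\Big(\frac{1}{c}(vt + w)(z_1 + z_2)\Big),
\end{align*}

completing the proof. □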

The implication of Fact 5.1 is that the received signal is a modulated version of
the original transmitted signal times the Fourier transform of the reflectivity function
along the line through the scene which bisects the lines defined by the bearing vectors
$z_1$, $z_2$. For a single aircraft with position $r$, Fact 5.1 yields

\[ g(t) = f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{2}{c}\|r\|}\,(F\rho)\Big(\frac{2}{c}(vt + w)z\Big), \tag{40} \]

provided the chirp rate of $f$ is chosen such that $v \ll \frac{c^2}{4\beta\|r\|^2}$. Thus, as the aircraft moves
across  the  synthetic  aperture,  each  received  signal  encodes  the  Fourier  transform  of 
the  reflectivity  function  along  a  different  line  through  the  scene,  namely,  those  lines 
defined  by  the  bearing  vectors  of  the  transmission  locations.  Consequently,  the  image 
of  the  scene  can  be  reconstructed  if  enough  samples  are  taken  to  enable  the  entire 
Fourier  transform  to  be  built  via  interpolation.  This  is  how  SAR  is  implemented  in 
practice. 

The  issue  that  arises  with  this  implementation  is  the  presence  of  phase  error. 
This  phase  error  is  actually  a  consequence  of  relative  uncertainty  in  the  distance  ||r|| 
at each transmission location, which directly impacts the modulation and phase
factors which precede the Fourier transform in (40). Small fluctuations in this distance,
which are relatively common due to factors such as aircraft performance, weather,
wind,  and  pilot  skill,  result  in  noticeable  phase  differences  between  the  received  sig¬ 
nals  [18].  As  a  result,  each  line  of  the  Fourier  transform  of  the  reflectivity  function 
obtained  is  skewed  by  an  independent  modulation  and  phase  factor.  Since  estimating 
the  modulation  is  possible  by  conventional  methods  [48],  one  could  use  interpolation 
to  obtain  the  complete  Fourier  transform  of  the  reflectivity  function  (up  to  a  global 
phase  factor)  if  these  phase  factors  were  all  the  same,  which  would  then  enable  im¬ 
age  reconstruction.  Unfortunately,  the  uncertainties  in  target  distance  are  typically 
uncorrelated,  and  so  any  image  reconstruction  algorithm  must  first  clear  this  hurdle. 

In  practice,  modern  inertial  navigation  systems  are  capable  of  keeping  the 
uncertainty  in  the  distance  ||r||  small  enough  to  render  the  modulating  term  in  (40) 
negligible  [48].  The  remaining  phase  error,  however,  cannot  be  eliminated  in  this  way. 
Two  methods  are  used  to  deal  with  the  phase  error:  One  method  is  to  use  motion 
sensors  on  the  aircraft  to  detect  fluctuations  in  the  flight  path,  which  can  be  used  to 
compute  a  correction  to  ||r||  and  determine  the  phase  errors  directly  [63].  The  second 
method,  known  as  autofocus,  is  more  widely  used — it  estimates  the  phase  errors  from 
the raw data under a priori assumptions on the image model [44]. Common autofocus
algorithms  include  phase  gradient  and  map  drift  [21,28];  of  these,  phase  gradient  is 
usually  preferred  since  the  phase  estimation  portion  of  the  algorithm  is  known  to  be 
optimal  in  a  maximum-likelihood  sense  [48]. 

As  an  alternative,  if  we  introduce  additional  information  to  the  data  set,  it  may 
be  possible  to  completely  determine  the  phase  errors  during  image  reconstruction. 
Suppose  we  insert  a  second  aircraft  into  the  SAR  imaging  process.  Let  the  two 
aircraft have positions $r_1$ and $r_3$, with bearing vectors $z_1$ and $z_3$, respectively (position
$r_2$ and bearing vector $z_2$ will be introduced later). Suppose aircraft $i$ transmits the
chirp $f(\tau) = e^{2\pi\mathrm{i}(\frac{1}{2}v\tau^2 + w\tau)}$ with chirp rate $v$ satisfying the hypothesis of Fact 5.1
for every transmitter-receiver pair. Then, by
Fact 5.1, we expect the received signal at aircraft $j$ to be of the form

\[ g_{i\to j}(t) := f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}(\|r_i\| + \|r_j\|)}\,(F\rho)\Big(\frac{1}{c}(vt + w)(z_i + z_j)\Big); \tag{41} \]

when $i \neq j$, this type of signature is characteristic of bistatic radar.

Consider the three received signals $g_{1\to 1}$, $g_{3\to 3}$, and $g_{1\to 3}$ obtained from (41).
If we combine these signals in a particular way, we can eliminate the modulations
and phase factors:

\begin{align*}
g_{1\to 1}(t)\,\overline{g_{1\to 3}(t)}^{\,2}\,g_{3\to 3}(t)
&= f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{2}{c}\|r_1\|}\,(F\rho)\Big(\frac{2}{c}(vt + w)z_1\Big) \\
&\quad\times \bigg(\overline{f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{1}{c}(\|r_1\| + \|r_3\|)}\,(F\rho)\Big(\frac{1}{c}(vt + w)(z_1 + z_3)\Big)}\bigg)^2 \\
&\quad\times f(t)\,e^{-2\pi\mathrm{i}(vt+w)\frac{2}{c}\|r_3\|}\,(F\rho)\Big(\frac{2}{c}(vt + w)z_3\Big) \\
&= |f(t)|^4\,(F\rho)\Big(\frac{2}{c}(vt + w)z_1\Big)\bigg(\overline{(F\rho)\Big(\frac{1}{c}(vt + w)(z_1 + z_3)\Big)}\bigg)^2(F\rho)\Big(\frac{2}{c}(vt + w)z_3\Big).
\end{align*}

The modulation and phase factors completely cancel, and what remains is simply
the magnitude of the original signal times a product of Fourier transforms of the
reflectivity function along lines through the scene defined by the unit vectors $z_1$, $z_3$
and their bisector $z_2 := \frac{z_1 + z_3}{\|z_1 + z_3\|}$. This interferometric effect bears some resemblance
to that leveraged by Alexeev, Bandeira, Fickus and Mixon in [3] to solve the phase
retrieval problem.
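
The cancellation can be illustrated with a toy numerical model. In the sketch below, the Fourier slices are replaced by arbitrary complex samples (`s11`, `s13`, `s33` are stand-ins, not computed from any reflectivity function), and the chirp parameters and distances are hypothetical. The point is only that the product $g_{1\to 1}\overline{g_{1\to 3}}^{\,2}g_{3\to 3}$ built from signals of the form (41) does not depend on the distances $\|r_1\|$ and $\|r_3\|$.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in m/s

def chirp(t, v, w):
    """Linear chirp with chirp rate v and base frequency w."""
    return np.exp(2j * np.pi * (0.5 * v * t**2 + w * t))

def received_signal(t, v, w, dist_i, dist_j, slice_values):
    """Toy version of (41): chirp, distance-dependent phase, Fourier slice samples."""
    phase = np.exp(-2j * np.pi * (v * t + w) * (dist_i + dist_j) / C_LIGHT)
    return chirp(t, v, w) * phase * slice_values

def interferometric_product(t, v, w, d1, d3, s11, s13, s33):
    """g_{1->1} * conj(g_{1->3})^2 * g_{3->3} for distances d1 = ||r_1||, d3 = ||r_3||."""
    g11 = received_signal(t, v, w, d1, d1, s11)
    g13 = received_signal(t, v, w, d1, d3, s13)
    g33 = received_signal(t, v, w, d3, d3, s33)
    return g11 * np.conj(g13) ** 2 * g33

# Hypothetical parameters and stand-in slice samples.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0e-5, 32)
v, w = 1.0e12, 1.0e9
s11, s13, s33 = (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size)
                 for _ in range(3))

nominal = interferometric_product(t, v, w, 5.0e4, 6.0e4, s11, s13, s33)
perturbed = interferometric_product(t, v, w, 5.0e4 + 0.37, 6.0e4 - 1.2, s11, s13, s33)
expected = s11 * np.conj(s13) ** 2 * s33     # |f|^4 = 1 for a unit-modulus chirp
```

Perturbing either distance changes each individual signal's phase substantially, yet `nominal` and `perturbed` agree with `expected` to rounding error.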

Before  exploring  this  connection,  it  is  important  to  note  that  the  Fourier  trans¬ 
forms  of  the  reflectivity  function  in  the  combination  above  are  taken  along  lines 
through the scene that are scaled differently (since $z_1 + z_3$ is not twice a unit
vector). Hence, the parameterization of this line differs from those of the other lines,
making them incompatible in Cartesian coordinates. Theoretically, this is not an
issue; indeed, we may still obtain the Fourier transform of the reflectivity function
via interpolation in polar coordinates. However, there is no fast algorithm for taking
the inverse Fourier transform of the result. In order to take advantage of the fast
Fourier  transform  (FFT),  one  must  first  convert  the  data  to  rectangular  form,  and  so 
this  scaling  issue  needs  to  be  addressed.  Monostatic  SAR  also  encounters  this  prob¬ 
lem,  which  can  be  solved  by  a  process  called  polar-to-rectangular  resampling  [48]. 
Essentially,  this  process  enables  the  data  along  any  line  of  the  Fourier  transform  of 
the  reflectivity  function  to  be  properly  scaled  without  influencing  the  integrity  of  the 
data.  Hence,  knowing  Fp  along  a  line  through  the  scene  with  any  scaling  factor  is 
enough  to  determine  Fp  along  that  same  line  with  any  other  scaling  factor.  Denoting 
the  ith  slice  of  the  Fourier  transform  Fp  by 

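\[ h_i(t) := (F\rho)\Big(\frac{2}{c}(vt + w)z_i\Big), \tag{42} \]

this allows the above expression to be rearranged:

\[ h_1(t)\,\overline{h_2(t)}^{\,2}\,h_3(t) = \Big(\overline{K(t)}\Big)^2 g_{1\to 1}(t)\,\overline{g_{1\to 3}(t)}^{\,2}\,g_{3\to 3}(t), \tag{43} \]

where $K(t)$ denotes the quantity

\[ K(t) := \frac{h_2(t)}{|f(t)|^2\,(F\rho)\big(\frac{1}{c}(vt + w)(z_1 + z_3)\big)} = \frac{(F\rho)\Big(\frac{2}{c}(vt + w)\frac{z_1 + z_3}{\|z_1 + z_3\|}\Big)}{|f(t)|^2\,(F\rho)\big(\frac{1}{c}(vt + w)(z_1 + z_3)\big)}. \]

Note that $K(t)$ can be calculated by first estimating $(F\rho)\big(\frac{1}{c}(vt + w)(z_1 + z_3)\big)$ up
to a global phase factor (by removing the modulation from $g_{1\to 3}$ using conventional
techniques [48]) and then using this to estimate $h_2(t)$ up to the same phase factor
by dilation and translation. Hence, there is no ambiguity in the phase of $K(t)$.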

At  this  point,  it  is  useful  to  identify  that  each  slice  hi  of  Fp  can  be  obtained 
from  the  corresponding  monostatic  signal 


=  f(t)e~2^vt+w^hi(t). 
Figure 4: (a) A triple of aircraft positions $(r_1, r_2, r_3)$ in a multistatic SAR scheme. (b) Using conventional
techniques, the monostatic signals $g_{i\to i}$ associated with each bearing vector $z_i$ produce the depicted
slices $h_i$ of the Fourier transform of the reflectivity function of the scene, up to distinct global phase
factors $\omega_i$. Meanwhile, the bistatic signal $g_{1\to3}$ transmitted at $r_1$ and received at $r_3$ produces a
dilated and translated version of the slice $h_2$. By Fact 5.2, this additional signal combines with the
monostatic signals to determine the product $\omega_1\omega_2^{-2}\omega_3$ of unknown phases.


Again by conventional techniques [48], we can estimate $h_i(t)$ from $g_{i\to i}(t)$ up to
a global phase factor $\omega_i$. Thus, we get the estimates $\hat{h}_i(t) := \omega_i h_i(t)$ from the
monostatic signal at position $r_i$. Expressing the received signals $g_{1\to1}$, $g_{3\to3}$, and
$g_{1\to3}$ in terms of these estimates, (43) implies


$$\omega_1\omega_2^{-2}\omega_3 = \frac{\hat{h}_1(t)\,\hat{h}_2(t)^{-2}\,\hat{h}_3(t)}{h_1(t)\,h_2(t)^{-2}\,h_3(t)}, \tag{44}$$


and so the bistatic signal $g_{1\to3}$ (when combined with $g_{1\to1}$, $g_{2\to2}$, and $g_{3\to3}$) enables
one to recover a product of the phase factors of $h_1$, $h_2$, and $h_3$ (see Figure 4 for an
illustration). We summarize this situation in the following fact:


Fact 5.2. Pick $r_1, r_2, r_3 \in \mathbb{R}^2$ such that $r_1 + r_3$ is a positive scalar multiple of $r_2$, and
let $f$ be a chirp with chirp rate $v$ and base frequency $w$ satisfying the hypotheses of
Fact 5.1. Suppose we obtain the signals $g_{1\to1}$, $g_{2\to2}$, $g_{3\to3}$ as defined in (41). Then
conventional techniques [48] yield the estimates $\hat{h}_i(t) = \omega_i h_i(t)$ for each $i = 1, 2, 3$.
Here, $h_i(t)$ denotes the $i$th slice of the desired Fourier transform (42), and $\omega_i$ is
an unknown phase factor. Furthermore, if we obtain $g_{1\to3}$, then combining with the
other signals according to (43) and (44) determines the product $\omega_1\omega_2^{-2}\omega_3$ of unknown
phase factors.

Since the magnitudes of the Fourier transform slices $h_i$ can be obtained from
the corresponding monostatic signals, determining these slices up to a single global
phase requires determining the phase errors $\omega_i$ from products of the form (44). Notice
that these products can be expressed in terms of relative phases:
$\omega_1\omega_2^{-2}\omega_3 = (\omega_1\omega_2^{-1})(\omega_2\omega_3^{-1})^{-1}$. Defining the relative phases
$\sigma_{1,2} := \omega_1\omega_2^{-1}$ and $\sigma_{2,3} := \omega_2\omega_3^{-1}$, we see that this quantity is itself a relative
phase of relative phases: $\omega_1\omega_2^{-2}\omega_3 = \sigma_{1,2}\sigma_{2,3}^{-1}$.
If this bistatic process is implemented while the two aircraft move across the synthetic
aperture, it is possible to record such a quotient of relative phases for each
triple $(r_i, r_j, r_k)$ of locations realized by their flight paths. Extracting the individual
phase errors from this collection of products is then possible via an algorithm
used by Bandeira et al. in [3] to solve the phase retrieval problem, namely, angular
synchronization.

5.2  Angular  synchronization 

The  angular  synchronization  algorithm,  as  first  introduced  by  Singer  in  [84], 
estimates  a  set  of  unknown  phases  using  (noisy)  measurements  of  relative  phase. 
The  idea  is  to  organize  the  phase  and  relative  phase  information  using  a  graph 
G,  from  which  certain  spectral  methods  enable  the  desired  estimation  with  little 
computational  burden.  Furthermore,  the  algorithm  is  provably  stable,  provided  the 
graph  G  is  “nice”  enough.  Here,  the  proper  notion  of  “nice”  is  in  terms  of  the 
connectivity  of  G. 

To set the stage, suppose we want to estimate a vector of phases $\omega := \{\omega_i\}_{i\in V} \subseteq \mathbb{C}$
for some finite set $V$ given a set of relative phase measurements $\{\omega_i\omega_j^{-1}\}_{(i,j)\in E}$,
where $E \subseteq V \times V$. Consider the simple graph $G = (V, E)$ so that each vertex in $V$
represents an unknown phase and each edge in $G$ a relative phase measurement. In
particular, $i \in V$ represents the phase $\omega_i$, while $(i,j) \in E$ if and only if $i, j \in V$ and
we have the measurement $\omega_i\omega_j^{-1}$.

Notice that in the noiseless case, it is possible to recover the vector $\omega$ (up to a
global phase) if and only if $G$ is connected [84]. Indeed, in this case one may choose
a spanning tree $T$ of $G$ and compute each phase $\omega_i$ by propagating from vertex to
vertex using edges in $T$. To be clear, by propagate we mean multiply the phase $\omega_i$ at
vertex $i$ by the relative phase $\omega_j\omega_i^{-1}$ to obtain the phase $\omega_j$ at vertex $j$. The global
phase ambiguity arises from the choice of starting phase, i.e., the phase assigned to
the root of $T$ [84]. Unfortunately, this approach does not work in the noisy case;
in fact, this method actually compounds the noise by adding another noise term at
each vertex.
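
As a minimal sketch of the noiseless propagation just described, the following Python
function traverses the graph breadth-first from a root and multiplies by relative phases
along the resulting tree edges. The function name `propagate_phases` and the dictionary
format for the measurements are illustrative assumptions of ours, not notation from [84].

```python
import cmath
from collections import deque

def propagate_phases(n, rel):
    """Recover phases on a connected graph from noiseless relative phases.

    rel maps a pair (i, j) to the relative phase w_i * conj(w_j).
    Returns a list est with est[0] = 1, so est equals w up to a global phase.
    """
    # Build an undirected adjacency list from the measured pairs.
    nbrs = {v: [] for v in range(n)}
    for (i, j) in rel:
        nbrs[i].append(j)
        nbrs[j].append(i)
    est = [None] * n
    est[0] = 1.0 + 0j              # arbitrary starting phase at the root
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in nbrs[i]:
            if est[j] is None:
                # w_j = w_i * (w_j w_i^{-1}); use whichever direction was measured.
                step = rel[(i, j)].conjugate() if (i, j) in rel else rel[(j, i)]
                est[j] = est[i] * step
                queue.append(j)
    return est
```

Every estimate is reached through a chain of multiplications from the root, which is
exactly why measurement noise accumulates along long paths in this scheme.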

Instead  of  propagating  the  relative  phases  along  only  the  edges  in  T,  suppose 
that  we  propagate  along  every  edge  in  E.  If  there  are  no  cycles  in  G,  then  this 
exhibits  the  same  problem  as  above.  However,  cycles  provide  a  means  of  “noise 
cancellation.” To see this, suppose $C$ is a cycle in $G$. Choosing two vertices $i, j \in C$,
there are two paths in $C$ from $i$ to $j$. Thus, we can choose to propagate the phase at
vertex $i$ along either of these paths to obtain the phase at vertex $j$, knowing that we
expect to obtain the same result regardless of which path we choose. Due to noise, it
is unlikely that one will obtain the same result for both paths, but the two differing
phases at vertex $j$ provide a means for comparison. In particular, if we take $j = i$,
then the sum of the phase errors along $C$ must be zero modulo $2\pi$; this cancellation
provides a means of combating the noise. Hence, the more cycles there are in $G$, the
more such comparisons one can make. This observation suggests that graphs with
many cycles are desirable, a quality which is present in highly connected graphs. This
idea is what led Bandeira et al. to exploit an (optimally connected) expander graph
to solve the phase retrieval problem using $O(M \log M)$ intensity measurements [3].

What  follows  is  a  more  in-depth  discussion  of  angular  synchronization,  but 
first  we  require  some  definitions  from  spectral  graph  theory.  For  a  simple,  connected 


graph with $n$ vertices, let $A$ denote the adjacency matrix and $D$ the diagonal matrix
of vertex degrees $\{d_i\}_{i=1}^n \subseteq \mathbb{N}$. The Laplacian of the graph is defined to be the
matrix $L := I - D^{-1/2}AD^{-1/2}$, which is positive semidefinite since for any $x \in \mathbb{C}^n$
we have

$$x^*Lx = \|x\|^2 - \sum_{i=1}^n \sum_{j=1}^n \frac{\overline{x_i}\,A[i,j]\,x_j}{\sqrt{d_id_j}} \geq 0,$$


with equality when $x = D^{1/2}\mathbf{1}$, where $\mathbf{1}$ is the vector whose entries are all 1 (for a
regular graph, this is a multiple of $\mathbf{1}$ itself); here, the inequality follows
from the fact that $d_id_j \geq 1$ whenever $A[i,j] = 1$. Hence, the spectrum of $L$ satisfies
$0 = \lambda_1 \leq \cdots \leq \lambda_n$. The second eigenvalue $\lambda_2$ of the Laplacian is known as the
spectral gap of the graph. As discussed in [3], a highly connected graph necessarily
has a large spectral gap.
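
To make the role of the spectral gap concrete, the following minimal sketch (Python with
NumPy) computes $L$ and $\lambda_2$ from a 0/1 adjacency matrix, assuming the graph has no
isolated vertices. The helper names `normalized_laplacian`, `spectral_gap`, and
`cycle_adjacency` are ours for illustration and do not appear in [3].

```python
import numpy as np

def normalized_laplacian(A):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))   # assumes every degree is positive
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def spectral_gap(A):
    """Second-smallest eigenvalue of the normalized Laplacian."""
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(normalized_laplacian(A))[1]

def cycle_adjacency(n):
    """Adjacency matrix of the n-cycle on vertices 0, ..., n-1."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A
```

For the 7-cycle the gap equals $1 - \cos(2\pi/7) \approx 0.377$, whereas the complete graph on
7 vertices has gap $7/6$, which illustrates how additional connectivity enlarges $\lambda_2$.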

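Returning to the graph $G = (V, E)$ defined above, let $\varepsilon := \{\varepsilon_{ij}\}_{(i,j)\in E}$ denote
a vector of adversarial noise terms. Since we seek to propagate phase along edges,
note that the noisy relative phase $\omega_i\omega_j^{-1} + \varepsilon_{ij}$ is associated with a direction, namely,
propagation from vertex $i$ to vertex $j$. For the reverse direction, we take the noisy
relative phase $\omega_j\omega_i^{-1} + \varepsilon_{ji} = \overline{\omega_i\omega_j^{-1} + \varepsilon_{ij}}$. Let $A_1$ denote the weighted adjacency
matrix of $G$, defined entrywise by

$$A_1[i,j] := \frac{\omega_i\omega_j^{-1} + \varepsilon_{ij}}{|\omega_i\omega_j^{-1} + \varepsilon_{ij}|} \tag{45}$$

whenever $(i,j) \in E$, with $A_1[i,j] := 0$ otherwise.
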


Note that $A_1$ is self-adjoint and each entry $A_1[i,j]$ is an approximation of the relative
phase $\omega_i\omega_j^{-1}$. Considering this, it makes sense that we may obtain the vertex phases
$\{\omega_i\}_{i\in V}$ from $A_1$ by solving the minimization problem

$$\hat\omega = \arg\min_{\{\omega_i\}_{i\in V} \subseteq \mathbb{T}} \sum_{(i,j)\in E} |\omega_i - A_1[i,j]\,\omega_j|^2. \tag{46}$$

One  method  of  (approximately)  solving  this  problem  is  angular  synchronization , 
which  is  summarized  in  Algorithm  3  (cf.  [3]). 
Algorithm 3 Angular synchronization

Input: Graph $G = (V, E)$, noisy relative phases $\omega_i\omega_j^{-1} + \varepsilon_{ij}$ for every $(i,j) \in E$
Output: Vector of phases $\{\hat\omega_i\}_{i\in V}$

Let $A_1$ denote the weighted adjacency matrix of $G$, defined entrywise in (45)
Let $D$ denote the diagonal matrix of vertex degrees $\{d_i\}_{i\in V}$
Compute the matrix $L_1 \leftarrow I - D^{-1/2}A_1D^{-1/2}$
Compute the eigenvector $u$ corresponding to the smallest eigenvalue of $L_1$
Output $\hat\omega_i = u_i/|u_i|$ for every $i \in V$
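
The steps of Algorithm 3 can be sketched in a few lines of Python with NumPy. This is
only an illustration under stated assumptions: the function name `angular_synchronization`
and the dictionary input format are ours, vertices are labeled $0, \ldots, n-1$, each edge
is measured in one direction only, and no vertex is isolated.

```python
import numpy as np

def angular_synchronization(n, noisy_rel):
    """Estimate n phases from noisy relative phases.

    noisy_rel maps (i, j), i != j, to a noisy measurement of w_i * conj(w_j).
    Returns a unit-modulus vector that estimates w up to a global phase.
    """
    A1 = np.zeros((n, n), dtype=complex)
    for (i, j), r in noisy_rel.items():
        A1[i, j] = r / abs(r)            # normalized entry, as in (45)
        A1[j, i] = np.conj(A1[i, j])     # reverse direction is the conjugate
    deg = np.count_nonzero(A1, axis=1).astype(float)   # vertex degrees
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Connection Laplacian L1 = I - D^{-1/2} A1 D^{-1/2}.
    L1 = np.eye(n) - d_inv_sqrt[:, None] * A1 * d_inv_sqrt[None, :]
    # L1 is Hermitian, and eigh returns eigenvalues in ascending order,
    # so the first column is the eigenvector for the smallest eigenvalue.
    _, vecs = np.linalg.eigh(L1)
    u = vecs[:, 0]
    return u / np.abs(u)
```

In the noiseless case on a connected graph, the output agrees with the true phase vector
after multiplication by a single unit-modulus scalar.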

The matrix $L_1 := I - D^{-1/2}A_1D^{-1/2}$ in Algorithm 3 is known as the connection
Laplacian of $G$, and bears resemblance to the Laplacian $L$. Note that Algorithm 3
suggests that the solution $\hat\omega$ to (46) is (approximately) the vector whose entries are
normalized versions of the entries of the eigenvector of $L_1$ corresponding to its smallest
eigenvalue. To see this, we will need the help of the following elementary result from
graph theory:

Lemma 5.3 (Degree-Sum Formula). Let $G = (V, E)$ be a simple graph and denote
by $d_i$ the degree of vertex $i \in V$. Then $\sum_{i\in V} d_i = 2|E|$.

Proof. Consider the subset $S \subseteq V \times E$ consisting of all pairs of vertices and their
incident edges, namely $S := \{(i, (j,k)) : i \in V,\ (j,k) \in E,\ j = i \text{ or } k = i\}$. Since the degree
of a vertex counts its incident edges, every $i \in V$ contributes $d_i$ elements to $S$. On
the other hand, every edge $(i,j) \in E$ determines exactly two elements of $S$, namely
$(i, (i,j))$ and $(j, (i,j))$. Hence, we have $\sum_{i\in V} d_i = |S| = 2|E|$, as desired. □

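We now provide some intuition behind the use of eigenvectors in Algorithm 3.
Expanding the sum in (46), we see that

$$\sum_{(i,j)\in E} |\omega_i - A_1[i,j]\,\omega_j|^2 = \sum_{(i,j)\in E} \Big( |\omega_i|^2 - 2\,\mathrm{Re}\big(\omega_i^{-1}A_1[i,j]\,\omega_j\big) + |A_1[i,j]\,\omega_j|^2 \Big)$$
$$= 2\sum_{(i,j)\in E} \Big( 1 - \mathrm{Re}\big(\omega_i^{-1}A_1[i,j]\,\omega_j\big) \Big).$$
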
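By Lemma 5.3, the first part of this expression becomes

$$2|E| = \sum_{i\in V} d_i = \sum_{i\in V} \omega_i^{-1} d_i\, \omega_i = \omega^*D\omega. \tag{47}$$

Meanwhile, the second term yields

$$\mathrm{Re}\Big( 2\sum_{(i,j)\in E} \omega_i^{-1}A_1[i,j]\,\omega_j \Big) = \sum_{i\in V}\sum_{j\in V} \omega_i^{-1}A_1[i,j]\,\omega_j = \omega^*A_1\omega,$$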

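and so (46) may be written $\hat\omega = \arg\min_{\omega\in\mathbb{T}^{|V|}} \omega^*(D - A_1)\omega$. Noting from (47) that
$\omega^*D\omega$ does not vary with $\omega$, this is equivalent to solving

$$\arg\min_{\omega\in\mathbb{T}^{|V|}} \frac{\omega^*(D - A_1)\omega}{\omega^*D\omega},$$

which, since $D - A_1$ is self-adjoint, is a Rayleigh quotient in terms of the connection
Laplacian of $G$:

$$\frac{\omega^*(D - A_1)\omega}{\omega^*D\omega} = \frac{(D^{1/2}\omega)^*(I - D^{-1/2}A_1D^{-1/2})(D^{1/2}\omega)}{\|D^{1/2}\omega\|^2} = \frac{(D^{1/2}\omega)^*L_1(D^{1/2}\omega)}{\|D^{1/2}\omega\|^2}.$$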

The minimum value of this Rayleigh quotient over all $\omega \in \mathbb{C}^{|V|}$ is the smallest
eigenvalue of $L_1$, attained when $u := D^{1/2}\omega$ is the corresponding eigenvector. Since
the $i$th coordinate of $D^{1/2}\omega$ is $\sqrt{d_i}\,\omega_i$, it follows that $\omega_i = u_i/\sqrt{d_i}$ is the optimal
choice for the relaxed form of (46). This does not necessarily have unit-modulus
entries, however, and so we normalize: $\hat\omega_i := u_i/|u_i|$.

To be clear, since Algorithm 3 produces an estimate $\hat\omega$ for the vector of unknown
phases using only relative phases, any estimate $e^{i\theta}\hat\omega$, $\theta \in \mathbb{R}$, is also a viable estimate;
this is reflected in the fact that the eigenvector $u$ is unique only up to a complex scalar.
Thus, angular synchronization yields the desired phase vector up to a global phase
ambiguity. Bandeira et al. [3] prove a stability guarantee for Algorithm 3 in terms
of the spectral gap $\lambda_2$ of the graph $G$. As such, we do not lose much in the relaxation
provided the underlying graph is sufficiently connected. We state their result here
(without proof) for completeness:


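Theorem 5.4 (Theorem 6.3 in [3]). Consider a graph $G = (V, E)$ with spectral
gap $\lambda_2 > 0$ and define $\|\theta\|_{\mathbb{T}} := \min_{k\in\mathbb{Z}} |\theta - 2\pi k|$ for all $\theta \in \mathbb{R}/2\pi\mathbb{Z}$. Furthermore,
let $A_1$ denote the weighted adjacency matrix of $G$, defined entrywise in (45). Then
Algorithm 3 outputs the estimate $\hat\omega \in \mathbb{C}^{|V|}$ with unit-modulus entries such that, for
some $\theta \in \mathbb{R}/2\pi\mathbb{Z}$,

$$\sum_{i\in V} \big\| \arg(\hat\omega_i) - \arg(\omega_i) - \theta \big\|_{\mathbb{T}}^2 \leq \frac{C\|\varepsilon\|^2}{P^2\lambda_2^2},$$

where $P := \min_{(i,j)\in E} |\omega_i\omega_j^{-1} + \varepsilon_{ij}|$ and $C$ is a universal constant.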


5.3  Formulating  the  phase  error  problem  with  graphs 

Recall the concluding discussion of Section 5.1, in which we combined several
received radar signals with their estimates to produce a product of their phase errors (44).
In particular, this product of phase errors is a quotient of relative phases.
With angular synchronization in hand, we will see that this nesting of relative phases
suggests a way to use angular synchronization to recover the phase errors.

First, we show how to organize the available phase information using graphs.
To this end, consider the graph $G = (V, E)$ for some finite set $V$. Let $\{\omega_i\}_{i\in V}$ be a
vector of unknown phase errors and $\{\sigma_{ij}\}_{(i,j)\in E}$ a set of unknown relative phases.
Now define a second graph $G' = (V', E')$ such that $V' = E$, and consider the set of
known relative phases $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$, where $\sigma_{ij} := \omega_i\omega_j^{-1}$ for every $(i,j) \in E$.
Note from Fact 5.2 that we have $\omega_i\omega_j^{-2}\omega_k = \sigma_{ij}\sigma_{jk}^{-1}$ for every $((i,j),(j,k)) \in E'$, and
so $G$ and $G'$ together encode a nested set of relative phases. Hence, one may seek
to first apply Algorithm 3 on $G'$ to determine the edge measurements for $G$ (up to
a global phase), at which point applying the algorithm a second time on $G$ would
determine the phase errors $\{\omega_i\}_{i\in V}$ (up to yet another global phase). Note that this

Figure 5: Two aircraft imaging a scene of interest (left) using bistatic SAR techniques at three different times. As in
Example 5.5, at the first time instant the aircraft are located at positions $a$ and $c$, while the second and third
time instants correspond to location pairs $(b, d)$ and $(c, e)$, respectively. To form the corresponding graphs
$G = (V, E)$ and $G' = (V', E')$ (right), note that Fact 5.2 states that, for each location pair, we receive
relative phase information relating the positions to their bisector. Hence, each location pair contributes
an edge to $G'$; for instance, the pair $(a, c)$ dictates that $(a, b) \sim (b, c)$. Furthermore, each of the vertices
in $G'$ corresponds to an edge in $G$. Here, two vertices are adjacent if one is the bisector of a location pair
that the other belongs to. In this way, edges in $G$ represent relative phases of the form $\sigma_{ij}$, while edges
in $G'$ encode the recorded quotients of relative phases $\sigma_{ij}\sigma_{jk}^{-1}$. Angular synchronization provides a means
of retrieving $\{\sigma_{ij}\}_{(i,j)\in V'}$ up to a global phase from $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$. Encoding the resulting
vertex measurements on the edges in $G$ then allows $\{\omega_i\}_{i\in V}$ to be obtained (up to a global phase and
partial modulation) by a second application of angular synchronization.


process  relies  heavily  on  the  connectivity  of  G',  which  is  inherited  from  properties  of 
G.  As  we  will  see,  connectivity  in  G'  is  not  easily  obtained.  Regardless,  the  idea  of 
using  angular  synchronization  on  both  graphs  motivates  the  approach  of  this  section. 

Example 5.5. Suppose two aircraft image the scene in Figure 5 along the synthetic
aperture from point $a$ to point $e$. In particular, the two aircraft image the scene
with location pairs $(a, c)$, $(b, d)$, and $(c, e)$ (meaning the first aircraft is at position
$a$ when the second is at $c$, and so on). Based on Fact 5.2, we record a product of
phase errors of the form $\sigma_{ij}\sigma_{jk}^{-1}$ at each location pair. To organize this information, we
take the graphs $G = (V, E)$ and $G' = (V', E')$, depicted in Figure 5, such that

$$V = \{a, b, c, d, e\}, \qquad E = \{(a,b), (b,c), (c,d), (d,e)\}$$

and

$$V' = E = \{(a,b), (b,c), (c,d), (d,e)\},$$
$$E' = \{((a,b),(b,c)),\ ((b,c),(c,d)),\ ((c,d),(d,e))\}.$$

In  this  way,  each  vertex  in  V  represents  an  unknown  phase  error,  each  edge  in  E 
(equivalently,  vertex  in  V')  represents  an  unknown  relative  phase  between  phase 
errors,  and  each  edge  in  E'  a  known  quotient  of  such  relative  phases  obtained  using 
Fact  5.2.  Thus,  performing  angular  synchronization  on  G'  and  then  on  G  will  return 
an  estimate  of  the  phase  errors  with  two  degrees  of  ambiguity.  □ 
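
The bookkeeping in Example 5.5 can be sketched in Python as follows. We assume each
measurement is supplied as a triple $(i, j, k)$, where $(i, k)$ is a location pair and $j$ is
the position on its bisector; the function name `build_graphs` and this input format are
illustrative choices of ours rather than part of the method.

```python
def build_graphs(triples):
    """Build V, E, and E' from triples (i, j, k) with j the bisector of (i, k)."""
    # Vertices of G: every position appearing in some triple.
    V = sorted({v for t in triples for v in t})
    # Edges of G (equivalently, vertices of G'): the pairs (i, j) and (j, k).
    E = []
    for (i, j, k) in triples:
        for e in ((i, j), (j, k)):
            if e not in E:
                E.append(e)
    # Edges of G': one recorded quotient of relative phases per triple.
    E_prime = [((i, j), (j, k)) for (i, j, k) in triples]
    return V, E, E_prime

# Location pairs (a, c), (b, d), (c, e) with bisectors b, c, d.
V, E, E_prime = build_graphs([("a", "b", "c"), ("b", "c", "d"), ("c", "d", "e")])
```

Both resulting graphs are paths here, so each is connected and the noiseless recovery is
possible, but neither contains a cycle that could help suppress noise.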

Note  that  the  graphs  G  and  G'  of  Example  5.5  are  both  nearly  cyclic;  indeed, 
both  cycles  would  be  completed  if  the  location  pair  (d,  a)  were  added  to  the  system. 
For  this  reason,  suppose  we  allow  two  aircraft  imaging  a  scene  to  follow  a  circular  path 
enclosing  the  target,  maintaining  their  relative  positions  throughout.  Depending  on 
the  distance  between  the  aircraft,  the  graphs  generated  by  the  bistatic  phase  errors 
are  then  cycles,  where  the  number  of  vertices  is  equal  to  the  number  of  location  pairs 
at  which  the  scene  is  imaged.  Allowing  for  multiple  pairs  of  aircraft  to  perform  such 
a  maneuver  along  this  circular  path  then  creates  multiple  cycles  in  G,  each  creating 
a separate cycle in $G'$. Hence, adding more aircraft effectively increases the connectivity
of  G,  which  means  the  second  application  of  Algorithm  3  will  be  more  stable  by 
Theorem  5.4.  It  is  for  this  reason  that  we  focus  on  a  particular  class  of  graphs, 
namely,  circulant  graphs: 

Definition 5.6. A simple graph $G = (V, E)$ is said to be circulant if its adjacency
matrix $A$ is a circulant matrix. That is, there exists a mapping $a\colon \mathbb{Z}_{|V|} \to \{0, 1\}$
such that $A[i,j] = a(j - i)$ for all $i, j \in \{0, \ldots, |V| - 1\}$.

Circulant  graphs  enjoy  a  lot  of  useful  mathematical  properties.  For  example, 
circulant  graphs  whose  vertex  sets  are  of  prime  order  have  a  convenient  structure  in 
terms  of  the  cycles  they  contain. 
Lemma 5.7. Let $G$ be a simple, circulant graph with $n$ vertices. If $n$ is an odd
prime, then $G$ decomposes into copies of the $n$-cycle.


Proof. Let $A$ denote the adjacency matrix of $G$. Then, because $G$ is circulant,
there exists a mapping $a\colon \mathbb{Z}_n \to \{0, 1\}$ such that $A[i,j] = a(j - i)$ for all
$i, j \in \{0, \ldots, n - 1\}$. Since the adjacency matrix of a simple graph is symmetric, we also
have $a(x) = a(-x)$ for all $x \in \mathbb{Z}_n$. Moreover, the fact that the diagonal entries of $A$
must vanish implies $a(0) = 0$. Let $\delta_i\colon \mathbb{Z}_n \to \{0, 1\}$ such that

$$\delta_i(x) := \begin{cases} 1 & \text{if } x = i \\ 0 & \text{otherwise} \end{cases}$$

for all $i \in \{0, \ldots, n - 1\}$. Then $a$ decomposes as

$$a(x) = \sum_{i\in\mathbb{Z}_n} a(i)\,\delta_i(x) = \sum_{i=1}^{\lfloor n/2\rfloor} a(i)(\delta_i + \delta_{-i})(x), \tag{48}$$

where the last equality follows from the assumption that $n$ is odd. Now consider the
set $S := \{1 \leq s \leq \lfloor n/2\rfloor : a(s) = 1\}$ and note that, by the properties of $a$ listed above,
(48) may be written as

$$a(x) = \sum_{s\in S} a(s)(\delta_s + \delta_{-s})(x) = \sum_{s\in S} (\delta_s + \delta_{-s})(x).$$

Thus, defining the matrices $\{A_s\}_{s\in S}$ entrywise by $A_s[i,j] := (\delta_s + \delta_{-s})(j - i)$, we can
decompose $A$ as follows:

$$A[i,j] = a(j - i) = \sum_{s\in S} (\delta_s + \delta_{-s})(j - i) = \sum_{s\in S} A_s[i,j].$$

We claim that each $A_s$ is the adjacency matrix of an $n$-cycle, from which it follows
that $G$ decomposes into $|S|$ copies of $C_n$. To prove this claim, note that for any $s \in S$,
$A_s[i,j] = 1$ whenever $j - i = \pm s$. Hence, consider the vertex labeling $\phi\colon \mathbb{Z}_n \to \mathbb{Z}_n$
defined by $\phi(i) := s^{-1}i$ for all $i \in \mathbb{Z}_n$, where $s^{-1}$ is the multiplicative inverse of $s$
in $\mathbb{Z}_n$ (note that this is a well-defined bijection since $s$ is nonzero and $n$ is prime,
implying $\mathbb{Z}_n$ is a field and $s$ is invertible). Furthermore, since $\phi$ is linear, $j - i = \pm s$
if and only if

$$\phi(j) - \phi(i) = \phi(j - i) = \phi(\pm s) = \pm\phi(s) = \pm 1,$$

meaning that $A_s[i,j] = 1$ if and only if $\phi(j) - \phi(i) = \pm 1$. Thus, each $A_s$ is isomorphic
(by the corresponding $\phi$) to the standard $n$-cycle, as desired. □
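
The decomposition in this proof is constructive, and a short Python sketch may help fix
ideas. We assume the graph is given by its symbol $a$ as a list of length $n$ with $n$ an odd
prime; the function name `circulant_cycle_decomposition` is hypothetical. For each
$s \in S$, the relabeling $\phi(i) = s^{-1}i$ is inverted to list the vertices in the order
in which the corresponding $n$-cycle visits them.

```python
def circulant_cycle_decomposition(n, a):
    """Decompose a circulant graph of odd prime order n into n-cycles.

    a is a list of length n with a[0] == 0 and a[x] == a[-x % n].
    Returns {s: order}, where order lists the vertices of the cycle with
    adjacency matrix A_s in the order in which the cycle visits them.
    """
    S = [s for s in range(1, n // 2 + 1) if a[s] == 1]
    cycles = {}
    for s in S:
        s_inv = pow(s, -1, n)                        # inverse of s in Z_n (Python 3.8+)
        phi = [(s_inv * i) % n for i in range(n)]    # relabeling phi(i) = s^{-1} i
        order = [None] * n
        for i in range(n):
            order[phi[i]] = i                        # vertex i sits at position phi(i)
        cycles[s] = order
    return cycles
```

For example, with $n = 7$ and $a(\pm 1) = a(\pm 3) = 1$, the cycle for $s = 3$ visits the vertices
in the order 0, 3, 6, 2, 5, 1, 4, so consecutive vertices always differ by $3$ modulo $7$.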

Now that we have a better understanding of circulant graphs, we apply this
understanding to solve the phase error problem in the noiseless case. To this end,
let $G = (V, E)$ be circulant with $n$ vertices, where $n$ is an odd prime. Also, consider
the graph $G' = (V', E')$ where $V' = E$ and $(i,j), (k,\ell) \in V'$ are adjacent if and only
if they share a neighbor in a common $n$-cycle in the decomposition of $G$ given by
Lemma 5.7. Thus, edges in $E'$ are of the form $((i,j),(j,k))$ for $i, j, k \in C_n \subseteq G$. For
instance, one way in which this can be implemented for multistatic SAR is to allow
$c + 1$ aircraft to circle a target scene, each performing monostatic SAR; if the first
aircraft in the formation transmits a bistatic signal for each of the other aircraft to
receive, then the resulting graph $G$ will be circulant with $c$ cycles.

To better develop our understanding of the phase error problem with circulant
graphs, we first focus on the case where $G$ is a cycle. To this end, suppose $\theta\colon V \to \mathbb{R}/\mathbb{Z}$
is any function on the vertices of an $n$-cycle in $G$ and consider the function
$\theta'\colon V' \to \mathbb{R}/\mathbb{Z}$ defined by $\theta'((i,j)) := \theta(j) - \theta(i)$ for every $(i,j) \in V'$. To be clear,
$\theta'$ encodes differences (modulo 1) in the value of $\theta$ at adjacent vertices in $G$, and
so resembles a finite difference. Similarly, consider the function $\theta''\colon E' \to \mathbb{R}/\mathbb{Z}$
defined by $\theta''(((i,j),(k,\ell))) := \theta'((k,\ell)) - \theta'((i,j))$, which encodes the same types
of differences in the value of $\theta'$ at adjacent vertices in $G'$. Since we may identify $V$
with $\mathbb{Z}_n$ in such a way that $i, j \in V$ are adjacent whenever $j - i = \pm 1 \bmod n$, it is
possible to redefine $\theta$, $\theta'$, and $\theta''$ as functions on $\mathbb{Z}_n$. In particular, we see that $\theta'$ and
$\theta''$ act on any $x \in \mathbb{Z}_n$ such that $\theta'(x) = \theta(x + 1) - \theta(x)$ and $\theta''(x) = \theta'(x + 1) - \theta'(x)$.

Note that the function $\theta$ may be made to represent a complex phase by considering
the unit-modulus number $e^{2\pi i\theta(x)}$ for any $x \in \mathbb{Z}_n$. With this in mind, $\theta'$
and $\theta''$ analogously represent the type of nested relative phases we are interested
in. (Indeed, we switch to $\theta$-notation here so that we can appeal to intuitions of
finite differences and integration. Note that $\theta$ lies in $[0, 1)$ instead of $[0, 2\pi)$ due to
the choice of normalization.) Thus, it is reasonable to ask whether it is possible to
determine $\theta$ (up to some sort of global ambiguity having two degrees of freedom)
given only the values of $\theta''$. This leads to the following lemma:

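Lemma 5.8. For any function $\theta\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$, let $\theta'\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$ and $\theta''\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$
be defined by $\theta'(x) := \theta(x + 1) - \theta(x)$ and $\theta''(x) := \theta'(x + 1) - \theta'(x)$. Then the
values $\{\theta''(x)\}_{x\in\mathbb{Z}_n}$ determine an estimate $\hat\theta$ such that $\hat\theta(x) = z + \frac{kx}{n} + \theta(x)$ for some
$z \in \mathbb{R}/\mathbb{Z}$ and $k \in \{0, \ldots, n - 1\}$.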

Considering the above discussion, the implication of this result is that knowing
the relative phase measurements represented by the edges in $G'$ determines the phases
encoded at the vertices of any cycle in $G$ up to a modulation and a global phase
factor, not unlike the result of two iterations of angular synchronization alluded
to in Example 5.5. Indeed, the processes $\theta'' \mapsto \theta'$ and $\theta' \mapsto \theta$ are just angular
synchronization in the noiseless case. To clarify, the estimate produced by Lemma 5.8
yields

$$e^{2\pi i\hat\theta(x)} = e^{2\pi iz}\,e^{2\pi ikx/n}\,e^{2\pi i\theta(x)}$$

for some $z \in \mathbb{R}/\mathbb{Z}$; hence, the values of $e^{2\pi i\hat\theta(x)}$ are a modulation of the actual values
of $e^{2\pi i\theta(x)}$, multiplied by a global phase. The global phase is precisely the ambiguity
associated with the second application of angular synchronization (on the graph $G$),
while the modulation is a consequence of propagating the ambiguity from the first
application of angular synchronization (on the graph $G'$) through the second. In
the noiseless case, this process is particularly clean; in fact, a single $n$-cycle in $G$ is
sufficient:

Corollary 5.9. Let $G = (V, E)$ be an $n$-cycle and $G' = (V', E')$ where $V' = E$ and
$(i,j), (k,\ell) \in V'$ are adjacent if and only if they share a vertex as edges in $G$. Furthermore,
let $\{\omega_i\}_{i\in V}$ be a vector of unknown phase errors, $\{\sigma_{ij}\}_{(i,j)\in E}$ a set of unknown
relative phases, and $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$ a set of known relative phases, where
$\sigma_{ij} := \omega_i\omega_j^{-1}$ for every $(i,j) \in E$. Then the measurements $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$
determine $\omega$ up to a modulation and a global phase factor.

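Proof. Let $g\colon V \to \mathbb{R}/\mathbb{Z}$ such that $\omega_i = e^{2\pi ig(i)}$ for every $i \in V$ and define the
functions $g'\colon V' \to \mathbb{R}/\mathbb{Z}$, $g''\colon E' \to \mathbb{R}/\mathbb{Z}$ by

$$g'((i,j)) := g(j) - g(i) \quad\text{and}\quad g''(((i,j),(k,\ell))) := g'((k,\ell)) - g'((i,j)).$$

Note that these definitions imply

$$\omega_i\omega_j^{-1} = e^{-2\pi ig'((i,j))} \quad \text{for all } (i,j) \in E,$$
$$\sigma_{ij}\sigma_{jk}^{-1} = e^{2\pi ig''(((i,j),(j,k)))} \quad \text{for all } ((i,j),(j,k)) \in E',$$

where we have implicitly assigned a direction to each edge in $G$ and $G'$. In particular,
the set of measurements $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$ completely determines the values
$\{g''(((i,j),(j,k)))\}_{((i,j),(j,k))\in E'}$.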

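Since $G$ is an $n$-cycle, there exists a vertex labeling $\phi\colon V \to \mathbb{Z}_n$ such that
$i, j \in V$ are adjacent if and only if $\phi(j) - \phi(i) = \pm 1 \bmod n$. Consider the functions
$\theta\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$, $\theta'\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$, and $\theta''\colon \mathbb{Z}_n \to \mathbb{R}/\mathbb{Z}$ defined earlier, so that

$$\theta(x) := g(\phi^{-1}(x)), \qquad \theta'(x) := \theta(x + 1) - \theta(x), \qquad \theta''(x) := \theta'(x + 1) - \theta'(x).$$

Since the values $\{\theta''(x)\}_{x\in\mathbb{Z}_n}$ are also completely determined by the set of measurements
$\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$, applying Lemma 5.8 yields the estimate
$\hat\theta(x) = z + \frac{kx}{n} + \theta(x)$ for some $z \in \mathbb{R}/\mathbb{Z}$ and $k \in \{0, \ldots, n - 1\}$. Thus, we have

$$\hat\omega_i := e^{2\pi i\hat g(i)} = e^{2\pi i\hat\theta(\phi(i))} = e^{2\pi i(z + k\phi(i)/n + \theta(\phi(i)))} = e^{2\pi iz}\,e^{2\pi ik\phi(i)/n}\,\omega_i,$$

completing the proof. □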

This result is not entirely surprising from the perspective of angular synchronization,
since we have already seen how the recovery of unknown phases from
noiseless relative phase measurements is possible provided the resultant graph is
connected. Still, Corollary 5.9 illustrates the power of Lemma 5.8 as a means of
analyzing the types of circulant graphs we are interested in.

Proof of Lemma 5.8. Since $\theta'(x + 1) = \theta''(x) + \theta'(x)$ for any $x \in \{0, \ldots, n - 1\}$, fixing
the estimate $\hat\theta'(0)$ iteratively determines the set $\{\hat\theta'(x)\}_{x=0}^{n-1}$. To show that this set
is consistent, we also require $\hat\theta'(0) = \theta''(n - 1) + \hat\theta'(n - 1)$. For this, note that an
inductive argument yields

$$\hat\theta'(n - 1) = \theta''(n - 2) + \hat\theta'(n - 2) = \sum_{k=0}^{n-2} \theta''(k) + \hat\theta'(0),$$

and so it suffices to show that

$$\theta''(n - 1) = -\sum_{k=0}^{n-2} \theta''(k) = -\sum_{k=0}^{n-2} \big(\theta'(k + 1) - \theta'(k)\big) = \theta'(0) - \theta'(n - 1),$$

which follows from the definition of $\theta''$. Hence, there exists $m \in \mathbb{R}/\mathbb{Z}$ such that
$\hat\theta'(x) = m + \theta'(x)$ for all $x \in \mathbb{Z}_n$.

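Similarly, since $\theta(x + 1) = \theta'(x) + \theta(x)$ for any $x \in \{0, \ldots, n - 1\}$, fixing
the estimate $\hat\theta(0)$ iteratively determines the set $\{\hat\theta(x)\}_{x=0}^{n-1}$ in terms of the estimates
$\{\hat\theta'(x)\}_{x=0}^{n-1}$; in particular, a telescoping sum in the definition of $\hat\theta(x)$ gives

$$\hat\theta(x) = mx + \theta(x) - \big(\theta(0) - \hat\theta(0)\big).$$

Hence, there exists $z \in \mathbb{R}/\mathbb{Z}$ such that $\hat\theta(x) = z + mx + \theta(x)$ provided the consistency
relation $\hat\theta(0) = \hat\theta'(n - 1) + \hat\theta(n - 1)$ is satisfied. To this end, an inductive argument
first yields

$$\hat\theta(n - 1) = \hat\theta'(n - 2) + \hat\theta(n - 2) = \sum_{k=0}^{n-2} \hat\theta'(k) + \hat\theta(0),$$

and so it suffices to show that

$$\hat\theta'(n - 1) = -\sum_{k=0}^{n-2} \hat\theta'(k) = -\sum_{k=0}^{n-2} \big(m + \theta'(k)\big)$$
$$= -m(n - 1) - \sum_{k=0}^{n-2} \big(\theta(k + 1) - \theta(k)\big) = -m(n - 1) + \theta(0) - \theta(n - 1).$$

By the definition of $\theta'$, we have $\hat\theta'(n - 1) = m + \theta(0) - \theta(n - 1)$, and so this only holds
if $mn = 0 \bmod 1$; that is, there must exist some integer $k$ such that $m = k/n$. Therefore,
we conclude that $\hat\theta(x) = z + \frac{kx}{n} + \theta(x)$
for all $x \in \mathbb{Z}_n$, where $z \in \mathbb{R}/\mathbb{Z}$ and $k \in \{0, \ldots, n - 1\}$. □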
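
The two integration steps in this proof can be sketched numerically as follows (Python
with NumPy). The function name `recover_from_second_differences` is ours, and the choices
$\hat\theta(0) = 0$ and one particular admissible value of $\hat\theta'(0)$ are arbitrary; any other
admissible choice changes the output only by the ambiguity $z + kx/n$ described in Lemma 5.8.

```python
import numpy as np

def recover_from_second_differences(theta2):
    """Estimate theta (mod 1) on Z_n from theta2[x] = theta'(x+1) - theta'(x) (mod 1)."""
    theta2 = np.asarray(theta2, dtype=float)
    n = len(theta2)
    # c[x] is the sum of theta2[0], ..., theta2[x-1], so the first-difference
    # estimate is d_hat[x] = d0 + c[x] for a starting value d0 still to be chosen.
    c = np.concatenate(([0.0], np.cumsum(theta2[:-1])))
    # Consistency around the cycle requires the d_hat[x] to sum to 0 (mod 1);
    # this leaves n admissible values of d0, and we pick one of them.
    d0 = (-np.sum(c) / n) % 1.0
    d_hat = (d0 + c) % 1.0
    # Integrate once more, fixing the arbitrary starting value theta_hat[0] = 0.
    theta_hat = np.concatenate(([0.0], np.cumsum(d_hat[:-1]))) % 1.0
    return theta_hat
```

Comparing the output with the true $\theta$ on a random example shows a difference of the form
$z + kx/n$ modulo 1, in agreement with the lemma.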
Notice that the recovery process outlined in the proof of Corollary 5.9 fails when
the measurements $\{\sigma_{ij}\sigma_{jk}^{-1}\}_{((i,j),(j,k))\in E'}$ are corrupted by noise, and so we must seek a
more stable solution. For this reason, we return to the more general case where $G$ is a
circulant graph with $n$ vertices such that $n$ is an odd prime, and we will represent the
phase information in a way that is more compatible with angular synchronization.
Since $\theta$ is any function taking vertices in $V$ to real numbers modulo 1, let
$\omega := \{\omega_i\}_{i\in V} \subseteq \mathbb{T}$ be a complex vector of phase errors such that $\omega_i := e^{2\pi i\theta(i)}$ for every
$i \in V$. Note that this construction is well-defined, and so we will treat the vector $\omega$
as a function from $V$ to $\mathbb{T}$. Using the proof of Corollary 5.9 as motivation, consider
the decomposition of $G$ into $n$-cycles given by Lemma 5.7, so that we may orient
each cycle to give every edge a direction. Now, redefine the function $\theta'\colon V' \to \mathbb{R}/\mathbb{Z}$ in
terms of these directions; in particular, take $\theta'((i,j)) := \theta(j) - \theta(i)$ whenever $i \to j$
in $G$. Then

$$(\omega\omega^*)[i,j] = e^{2\pi i(\theta(i) - \theta(j))} = e^{-2\pi i\theta'((i,j))}.$$

To  similarly  redefine  9",  let  B  C  E'  denote  the  set  of  all  ordered  pairs  of  consecutive 
edges  in  a  common  n-cycle  of  G  from  the  decomposition  above  and  take  9” :  B  — >  M/Z 
such  that 

«"(((*,  j),  U,  *0))  :=  «'(( J.  k))  -  »'((;,  j))  =  8(k)  -  2 8(j)  +  9(i). 

Hence,  we  have 

A;])-1  =  =  o"(((kJW,i)))} 

and  so  such  products  between  entries  of  c ou*  for  edges  in  B  may  be  used  to  estimate 
the  entries  of  u  by  applying  Lemma  5.8. 
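The identity above can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical cycle step `a` (the cycle $i \to i + a \bmod n$ of a circulant graph); the variable names are illustrative and not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7                                   # odd prime number of vertices
theta = rng.random(n)                   # theta(i) in R/Z
omega = np.exp(2j * np.pi * theta)      # omega_i = exp(2*pi*i*theta(i))
outer = np.outer(omega, omega.conj())   # (omega omega^*)[i, j] = omega_i * conj(omega_j)

def theta_pp(i, j, k):
    """Nested relative phase theta''(((i, j), (j, k))), modulo 1."""
    return (theta[k] - 2 * theta[j] + theta[i]) % 1.0

a = 2  # hypothetical step of one n-cycle: i -> i + a (mod n)
max_err = 0.0
for i in range(n):
    j, k = (i + a) % n, (i + 2 * a) % n
    lhs = outer[i, j] / outer[j, k]                      # product of outer-product entries
    rhs = np.exp(2j * np.pi * theta_pp(i, j, k))         # phase encoded by theta''
    max_err = max(max_err, abs(lhs - rhs))
```

In exact arithmetic `max_err` is zero; in floating point it should sit at round-off level.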




To see this, note that Lemma 5.8 provides a means of producing an estimator $\hat\theta$ for $\theta$ based on the given values of $\theta''$ on a cycle. Specifically, for each cycle $C$ one may obtain the estimate $\hat\theta(i) = z(C) + \frac{k(C)\,i}{n} + \theta(i)$ for some cycle-specific $z(C) \in \mathbb{R}/\mathbb{Z}$ and $k(C) \in \{0, \ldots, n-1\}$. In particular, for each cycle $C$ in the decomposition of $G$, the values of $\theta''$ determine the estimator $\hat\theta'((i,j)) = m(C) + \theta'((i,j))$ for all $(i,j) \in C$, where $m(C) \in \mathbb{R}/\mathbb{Z}$. With this in mind, consider the matrices $X_C$ and $X_{-C}$, defined entrywise by

\[ X_C[i,j] := \begin{cases} e^{-2\pi\mathrm{i}\,\hat\theta'((i,j))} & \text{if } i \to j \text{ in } C \\ 0 & \text{otherwise,} \end{cases} \qquad X_{-C}[i,j] := \begin{cases} e^{2\pi\mathrm{i}\,\hat\theta'((j,i))} & \text{if } j \to i \text{ in } C \\ 0 & \text{otherwise} \end{cases} \tag{49} \]

for each $n$-cycle $C$ in the decomposition of $G$. Then the collection of matrices $\{X_C, X_{-C}\}_{C \subseteq G}$ provides a means of estimating the off-diagonal entries of the outer product $\omega\omega^*$.
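As a small illustration of these definitions, the sketch below builds $X_C$ and $X_{-C}$ for a single cycle and compares them with the outer product in the noiseless case. It assumes NumPy, a cycle of the form $i \to i + a \bmod n$, and a hypothetical offset `m` playing the role of $m(C)$; the function name `cycle_matrices` is not from the thesis.

```python
import numpy as np

def cycle_matrices(theta_hat_prime, n, a):
    """Build X_C and X_{-C} for the n-cycle i -> i + a (mod n).

    theta_hat_prime[i] is the estimate of theta'((i, i + a)), modulo 1.
    """
    X_pos = np.zeros((n, n), dtype=complex)
    X_neg = np.zeros((n, n), dtype=complex)
    for i in range(n):
        j = (i + a) % n
        X_pos[i, j] = np.exp(-2j * np.pi * theta_hat_prime[i])  # i -> j in C
        X_neg[j, i] = np.exp(2j * np.pi * theta_hat_prime[i])   # entry [j, i] uses theta'((i, j))
    return X_pos, X_neg

rng = np.random.default_rng(1)
n, a, m = 7, 3, 0.2                     # m stands in for the unknown offset m(C)
theta = rng.random(n)
omega = np.exp(2j * np.pi * theta)
outer = np.outer(omega, omega.conj())
theta_hat_prime = np.array([(m + theta[(i + a) % n] - theta[i]) % 1.0 for i in range(n)])
X_pos, X_neg = cycle_matrices(theta_hat_prime, n, a)
mask = X_pos != 0                       # support of the oriented cycle
```

On the support of the cycle, `X_pos` agrees with `outer` up to the single phase `exp(-2j*pi*m)`, and `X_neg` is the conjugate transpose of `X_pos`, which is why each cycle fixes the corresponding entries of the outer product only up to one global phase.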

Let $A$ denote the adjacency matrix of $G$ and consider the subspace $\mathcal{T}$ of the vector space of self-adjoint $n \times n$ matrices $\mathbb{H}^{n \times n}$ defined by

\[ \mathcal{T} := \{X \in \mathbb{H}^{n \times n} : X[i,j] = 0 \text{ whenever } A[i,j] = 1\}. \]

Notice that the nonzero entries of any matrix $X \in \mathcal{T}$ do not coincide with those of any matrix in the collection $\{X_C, X_{-C}\}_{C \subseteq G}$. Thus, defining the Minkowski sum $\mathcal{S} := \operatorname{span}\{X_C, X_{-C}\}_{C \subseteq G} + \mathcal{T}$, it follows that $\omega\omega^* \in \mathcal{S}$ (for the noiseless case).

Finally, since the diagonal entries of $\omega\omega^*$ are all equal to 1, defining

\[ \delta_i(x) := \begin{cases} 1 & \text{if } x = i \\ 0 & \text{otherwise} \end{cases} \]

for each $i \in \{0, \ldots, n-1\}$ yields $\langle \omega\omega^*, \delta_i\delta_i^* \rangle_{\mathrm{HS}} = 1$ for every $i = 0, \ldots, n-1$. Combining this with the above result, we see that one may obtain an estimate for the outer product $\omega\omega^*$ by seeking a rank-1 matrix as close to the subspace $\mathcal{S}$ as possible whose diagonal entries are nearly 1. This leads to the following feasibility problem for some fixed tolerance $\varepsilon > 0$; here, $P_{\mathcal{V}} X$ denotes the projection of $X$ onto the subspace $\mathcal{V}$:

\[ \text{Find } X \in \mathbb{H}^{n \times n} \text{ such that } \operatorname{rank}(X) = 1 \text{ and } \|P_{\mathcal{S}^\perp} X\|_{\mathrm{HS}} + \sum_{i=0}^{n-1} \big|1 - \langle X, \delta_i\delta_i^* \rangle_{\mathrm{HS}}\big|^2 \leq \varepsilon. \tag{50} \]

Due to the tolerance $\varepsilon > 0$, we expect that the above program would be particularly stable to noise on the order of $\varepsilon$. Unfortunately, this program is not convex and so is not easily solved. To make it convex, one could relax the rank-1 condition to a positive-semidefinite requirement, but this makes the feasibility set too large; indeed, the identity matrix is necessarily feasible for the relaxed problem but has full rank. Properly relaxing the problem remains an open problem. Note that the outer product of any modulation of $\omega$ is feasible in (50). As such, any convex combination of these will be feasible in a convex relaxation. In Appendix B, we show how to determine $\omega$ up to a modulation and a global phase factor from any such convex combination.
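The modulation and global phase ambiguity can be seen directly in the data: the nested products along a cycle of a circulant graph do not change when the phase error vector is modulated or multiplied by a global phase. The sketch below checks this, assuming NumPy and the convention that a modulation multiplies entry $x$ by $e^{2\pi\mathrm{i}kx/n}$; the helper `nested_products` is a hypothetical name.

```python
import numpy as np

def nested_products(omega, a):
    """Return omega_i * omega_j^{-2} * omega_k along the cycle i -> i+a -> i+2a (mod n)."""
    n = len(omega)
    i = np.arange(n)
    j, k = (i + a) % n, (i + 2 * a) % n
    return omega[i] * omega[j] ** -2 * omega[k]

rng = np.random.default_rng(1)
n = 11
omega = np.exp(2j * np.pi * rng.random(n))
k_mod, z = 4, 0.3                                   # arbitrary modulation index and global phase
modulated = np.exp(2j * np.pi * (z + k_mod * np.arange(n) / n)) * omega
same = all(
    np.allclose(nested_products(omega, a), nested_products(modulated, a))
    for a in range(1, n)
)
```

Here `same` evaluates to `True` because the exponents contributed by the modulation cancel modulo $n$ in $i - 2j + k$, which is why no method based solely on these products can resolve the two ambiguities.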

5.4  Solving  the  phase  error  problem  in  the  noisy  case 

The  process  of  determining  phase  errors  from  nested  relative  phases  via  the 
feasibility  problem  (50)  of  the  previous  section  appears  to  be  quite  difficult.  For  this 
reason,  we  seek  the  phase  errors  by  another  method.  In  particular,  we  will  obtain 
them  by  applying  angular  synchronization  iteratively. 

Let $\omega = \{\omega_i\}_{i \in V} \subseteq \mathbb{T}$ be a vector of unknown phase errors and consider the graphs $G = (V,E)$ and $G' = (V',E')$ from the previous section, where $G$ is circulant with $n$ vertices such that $n$ is an odd prime. Thus, we encode the unknown relative phases as edge measurements in $G$, namely $\{\omega_i\omega_j^{-1}\}_{(i,j) \in E}$, and similarly for the known (noisy) relative phases in $G'$, $\{\sigma_{ij}\sigma_{jk}^{-1} + \varepsilon_{(i,j),(j,k)}\}_{((i,j),(j,k)) \in E'}$, where $\sigma_{ij} := \omega_i\omega_j^{-1}$ for all $i,j \in \{0, \ldots, n-1\}$ and each $\varepsilon_{(i,j),(j,k)}$ is an adversarial noise term (under the assumption that $\varepsilon_{(i,j),(j,k)} = \varepsilon_{(k,j),(j,i)}$). Since Algorithm 3 (angular synchronization) requires the input graph be directed, suppose we arbitrarily direct the edges of $G$ and $G'$ such that edges in a common $n$-cycle have the same orientation. To be consistent, this means we take only the relative phases $\omega_i\omega_j^{-1}$ if $(i,j) \in E$ or $\omega_j\omega_i^{-1} = \overline{\omega_i\omega_j^{-1}}$ if $(j,i) \in E$ (likewise for $\sigma_{ij}\sigma_{jk}^{-1} + \varepsilon_{(i,j),(j,k)}$ as edges in $E'$).

Unfortunately, the structure $G'$ inherits from $G$ is not conducive to angular synchronization. In fact, each $n$-cycle in the decomposition of $G$ corresponds to a distinct $n$-cycle in $G'$, each of which forms its own component. Thus, $G'$ itself is not connected and, therefore, has a spectral gap of 0. For this reason, it doesn't even make sense to perform angular synchronization on $G'$. We can, however, use the cycles in $G'$, along with the corresponding edge measurements, as separate inputs for Algorithm 3. Since the vertices in $G'$ correspond to edges in $G$, the outputs then estimate certain entries of the phase error outer product $\omega\omega^*$, specifically those represented by the edges of each cycle in $G$. By Lemma 5.8, it follows that the information encoded on the edges of distinct cycles in $G$ is obtained up to cycle-specific global phase factors. Assuming one could "synchronize" these phases in such a way that each cycle has the same global phase, it would then be possible to use $G$ and the relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in E}$ as inputs in Algorithm 3 to obtain an estimate for the phase error vector $\omega$. Admittedly, this is the most ad hoc portion of the phase error recovery process; indeed, the feasibility problem (50) was considered as a possible way to circumvent this issue. The problem with this intermediate step is that the global phase factors associated with each cycle in $G$ interact differently depending on the cycle orientations. To see this, note that each $n$-cycle in $G$ necessarily contains every vertex of $G$, and so traversing a given cycle along adjacent edges generates a unique walk in $G$ which reaches every vertex exactly once before repeating. Because each cycle is distinct, however, the number of steps necessary to reach a certain vertex from the same starting vertex in any two walks is different. Recalling that phase propagates along edges, this means that the accumulated phase due to the global phase factors varies from cycle to cycle. Overall, it would be much better to estimate the relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in E}$ from $\{\omega_i\omega_j^{-2}\omega_k\}_{((i,j),(j,k)) \in E'}$ in one step instead of synchronizing components of $G$ individually before synchronizing their outputs. In the absence of a better alternative, we continue.
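The disconnectedness of $G'$ is easy to confirm on a small example. The sketch below assumes NumPy and a circulant graph described by a hypothetical list `steps` of cycle steps; it builds $G'$ with one vertex per edge of $G$ and joins consecutive edges of the same $n$-cycle, then counts components through the zero eigenvalues of the graph Laplacian.

```python
import numpy as np

def line_cycle_graph(n, steps):
    """Adjacency matrix of G': vertex c*n + i is the edge (i, i + a_c) of G,
    and consecutive edges of the same n-cycle are adjacent."""
    size = n * len(steps)
    A = np.zeros((size, size))
    for c, a in enumerate(steps):
        for i in range(n):
            u = c * n + i                # edge (i, i + a)
            v = c * n + (i + a) % n      # edge (i + a, i + 2a)
            A[u, v] = A[v, u] = 1.0
    return A

def laplacian_spectrum(A):
    """Eigenvalues of the graph Laplacian D - A in ascending order."""
    return np.linalg.eigvalsh(np.diag(A.sum(axis=1)) - A)

n, steps = 7, [1, 2, 3]
spectrum = laplacian_spectrum(line_cycle_graph(n, steps))
num_components = int(np.sum(spectrum < 1e-9))
```

With three cycles the Laplacian has three zero eigenvalues, so the second-smallest eigenvalue (the spectral gap) vanishes, in line with the remark that synchronizing on all of $G'$ at once is not meaningful.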

To  synchronize  the  cycles,  we  take  advantage  of  the  circulant  property  of  the 
adjacency  matrices  of  each  cycle  in  G;  in  particular,  note  that  it  is  possible  to  obtain 
common support amongst these matrices simply by raising each to a particular nonnegative power. These exponents represent the different number of steps necessary
to  move  in  each  cycle  in  order  to  reach  the  same  vertex  given  a  common  starting 
vertex.  Since  cycles  must  exhibit  consistency  in  relative  phase  (i.e.,  the  product  of 
relative  phase  between  any  two  vertices  is  constant  regardless  of  the  cycle  chosen), 
raising  each  matrix  to  the  appropriate  power  without  the  global  phase  factors  should 
yield  a  set  of  identical  circulant  matrices.  However,  the  fact  that  the  exponents  differ 
means  that  the  global  phase  factors  are  not  related  linearly.  For  instance,  to  move 
between  adjacent  vertices  in  one  cycle  may  require  two  steps  in  another  cycle,  and 
so  their  corresponding  phase  factors  have  a  quadratic  relationship;  others  may  yet 
have  even  higher  order.  To  determine  each  cycle  up  to  a  common  global  phase,  it  is 
therefore  necessary  to  incorporate  all  of  these  phase  relationships. 
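For unweighted cycles the exponent can be computed with a modular inverse. The following sketch assumes NumPy, Python 3.8 or later (for `pow(b, -1, n)`), and cycles of the form $i \to i + a \bmod n$; both function names are hypothetical.

```python
import numpy as np

def cycle_permutation(n, a):
    """Adjacency matrix of the directed n-cycle i -> i + a (mod n)."""
    P = np.zeros((n, n), dtype=int)
    P[np.arange(n), (np.arange(n) + a) % n] = 1
    return P

def common_support_exponent(n, a, b):
    """Exponent s in {1, ..., n-1} with s * b = a (mod n), so that the s-th power
    of the step-b cycle has the same support as the step-a cycle."""
    return (a * pow(b, -1, n)) % n

n = 7
s_12 = common_support_exponent(n, 1, 2)   # steps of the step-2 cycle per step of the step-1 cycle
s_21 = common_support_exponent(n, 2, 1)   # steps of the step-1 cycle per step of the step-2 cycle
P_pow = np.linalg.matrix_power(cycle_permutation(n, 2), s_12)
```

For $n = 7$ this gives `s_12 == 4` and `s_21 == 2`: one step along the step-2 cycle costs two steps along the step-1 cycle, which is the quadratic relationship between phase factors mentioned above.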

Let $S \subseteq \mathbb{N}$ denote a set of indices such that each distinct $n$-cycle in $G$ is labeled $C_i$ for some $i \in S$. Furthermore, assume that two $n$-cycles $C_i$ and $C_j$ in $G$ are distinct if and only if $i,j \in S$ and $i \neq j$. Consider the function $s \colon S \times S \to \mathbb{Z}_n$ which outputs the exponent such that $X_{C_i}$ and $X_{C_j}^{s(i,j)}$ have common support; here, $X_{C_i}$ denotes the weighted adjacency matrix of the cycle $C_i$ defined in (49). Then by the above discussion there exists a function $\alpha \colon S \times S \to \mathbb{T}$ such that

\[ \alpha(i,j) X_{C_i} = X_{C_j}^{s(i,j)} \]

for all $i,j \in S$. Since these weighted adjacency matrices are related to the actual adjacency matrices of the cycles by distinct global phase factors, denoting the phase factor associated with the cycle $C_i$ by $\beta_i$ then yields the relation

\[ \alpha(i,j) = \beta_i^{-1} \beta_j^{s(i,j)} \]

for all $i,j \in S$. Rearranging, we see that the phases $\{\beta_i\}_{i \in S}$ may be expressed as $\beta_i = \overline{\alpha(i,j)}\, \beta_j^{s(i,j)}$ for all $i,j \in S$. Thus, we require an initial guess for one of the phases in order to generate the entire set. Such a guess is not an issue, since it merely assigns an arbitrary phase to one of the cycles in $G$, in terms of which the remaining phase factors are determined. To be clear, we already know there must be an ambiguity in the result regarding a single global phase factor, making this step legitimate. Rather than simply choosing a random phase to generate the set of phases, it makes sense to choose a particular $\beta_i$ such that it scales $X_{C_i}$ to have phase 1; i.e., we choose $\beta_i^n$ to be equal to the product of the nonzero entries of $X_{C_i}$ for some fixed $i \in S$. Using this fixed phase to generate the remaining phases then yields the set $\{\beta_i\}_{i \in S}$ up to a single global phase; we outline this process in its entirety in Algorithm 4. As we saw in Lemma 5.8 and Corollary 5.9, the consequence of this global phase is to ultimately cause the estimated phase error vector to be a modulation of the true phase error vector.
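The process just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis implementation: it assumes NumPy, Python 3.8 or later, noiseless unit-modulus weights, cycles of the form $x \to x + a \bmod n$, and the convention that each input matrix equals $\beta_i$ times the matrix of true relative phases. The names `weighted_cycle` and `cycle_synchronization` are hypothetical.

```python
import numpy as np

def weighted_cycle(omega, a):
    """Y[x, x + a] = omega_x * conj(omega_{x + a}) along the cycle x -> x + a (mod n)."""
    n = len(omega)
    Y = np.zeros((n, n), dtype=complex)
    x = np.arange(n)
    Y[x, (x + a) % n] = omega * np.conj(omega[(x + a) % n])
    return Y

def cycle_synchronization(Xs, ell=0):
    """Estimate the cycle-dependent phase factors, starting from cycle `ell`."""
    n = Xs[0].shape[0]
    # Step of each cycle, read off the support of row 0.
    steps = [int(np.flatnonzero(np.abs(X[0]) > 0.5)[0]) for X in Xs]
    # beta_ell^n is the product of the nonzero entries; pick one of its n-th roots.
    prod = np.prod(Xs[ell][np.abs(Xs[ell]) > 0.5])
    betas = [None] * len(Xs)
    betas[ell] = np.exp(1j * np.angle(prod) / n)
    for k, X in enumerate(Xs):
        if k == ell:
            continue
        s = (steps[k] * pow(steps[ell], -1, n)) % n      # exponent s[k, ell]
        Xp = np.linalg.matrix_power(Xs[ell], s)          # same support as X
        mask = np.abs(X) > 0.5
        alpha = np.sum(Xp[mask] / X[mask])               # alpha * X_k = X_ell^s
        alpha /= abs(alpha)
        betas[k] = np.conj(alpha) * betas[ell] ** s
    return np.array(betas)

rng = np.random.default_rng(2)
n, steps = 7, [1, 2, 3]
omega = np.exp(2j * np.pi * rng.random(n))
betas = np.exp(2j * np.pi * rng.random(len(steps)))      # unknown per-cycle phases
Xs = [b * weighted_cycle(omega, a) for b, a in zip(betas, steps)]
betas_hat = cycle_synchronization(Xs)
```

Because the $n$-th root is only determined up to an $n$-th root of unity, dividing each `Xs[k]` by `betas_hat[k]` reproduces the relative phases of some modulation of `omega` rather than of `omega` itself, which matches the ambiguity noted in the text.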

Now  that  we  have  recovered  the  phase  error  vector  up  to  a  global  phase  and 
modulation,  a  SAR  image  may  be  reconstructed  using  a  simple  algorithm,  which 
we now discuss. Recall that the phase error vector provides the phase factors associated with slices of the Fourier transform of the target image obtained according
to  Fact  5.2.  To  recover  the  modulation,  we  leverage  the  effect  that  modulating  the 
Algorithm 4 Cycle synchronization

Input: Weighted $n \times n$ adjacency matrices $\{X_{C_i}\}_{i \in S}$ defined in (49)
Output: Vector $\{\beta_i\}_{i \in S}$ of cycle-dependent phase factors
Initialize $|S| \times |S|$ matrices $s$ and $\alpha$ of all zeros
for $k, j = 1$ to $|S|$ do
    Compute $p \in \mathbb{Z}_n$ and $c \in \mathbb{T}$ such that $c X_{C_k} = X_{C_j}^{p}$
    Assign $s[k,j] \leftarrow p$ and $\alpha[k,j] \leftarrow c$
end for
Fix $\ell \in S$
Compute $\beta_\ell \leftarrow \big(\prod_{(k,j):\, X_{C_\ell}[k,j] \neq 0} X_{C_\ell}[k,j]\big)^{1/n}$ {Initialize $\beta_\ell$ in terms of the product of all nonzero entries of $X_{C_\ell}$}
for $k = 1$ to $|S|$, $k \neq \ell$ do
    Compute $\beta_k \leftarrow \overline{\alpha[k,\ell]}\, \beta_\ell^{s[k,\ell]}$
end for
Output: $\{\beta_i\}_{i \in S}$


slices  in  the  Fourier  domain  has  on  the  total  variation  of  the  image  in  the  spatial 
domain.  The  total  variation  of  an  image  is  the  sum  of  the  absolute  values  of  the 
differences  between  adjacent  pixels  in  the  image.  Since  a  modulation  of  the  slices  in 
the  Fourier  domain  is  a  pointwise  multiplication  of  the  Fourier  transform,  it  has  the 
effect  of  a  convolution  in  the  spatial  domain  with  some  function,  essentially  blurring 
the  image.  Thus,  sharp  contrasts  in  the  image  are  reduced;  in  other  words,  those 
adjacent  pixels  which  contribute  more  to  the  total  variation  are  changed  in  such  a 
way  that  their  contribution  is  reduced,  and  so  a  modulation  in  the  Fourier  domain 
reduces  the  total  variation  in  the  spatial  domain.  Consequently,  the  best  estimate 
for  the  true  modulation  is  that  which  maximizes  the  total  variation  of  the  image.  In 
order  to  be  independent  of  the  unknown  global  phase  factor,  we  actually  take  the 
modulation  which  maximizes  the  total  variation  of  the  absolute  value  of  the  image. 
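As a concrete reading of this definition, the sketch below computes the total variation of the absolute value of an image. It assumes NumPy and a non-periodic boundary (only pixel pairs inside the image are compared), which is a choice made here rather than something the text specifies.

```python
import numpy as np

def total_variation(img):
    """Sum of absolute differences between vertically and horizontally adjacent
    pixels of |img|; taking magnitudes removes any global phase factor."""
    a = np.abs(img)
    return np.abs(np.diff(a, axis=0)).sum() + np.abs(np.diff(a, axis=1)).sum()

sharp = np.zeros((3, 3))
sharp[1, 1] = 1.0                                 # one bright pixel: strong contrast
flat = np.full((3, 3), 1.0 / 9.0)                 # same total mass, fully smeared out
tv_sharp = total_variation(sharp)                 # 4.0
tv_flat = total_variation(flat)                   # 0.0
tv_phased = total_variation(np.exp(0.7j) * sharp) # unchanged by a global phase
```

The smeared image has zero total variation while the sharp one has total variation 4, a toy version of the contrast loss that the demodulation step is designed to detect.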

As  for  the  global  phase  factor,  suppose  the  true  image  data  in  the  spatial 
domain  is  all  real  and  nonnegative.  Then  taking  the  two  dimensional  inverse  Fourier 
transform  of  the  data  with  the  estimated  modulation  determined  above  will  yield 
values  in  the  spatial  domain  that  are  (approximately)  scaled  versions  of  the  global 
phase.  Thus,  adding  all  pixel  values  and  normalizing  gives  an  estimate  for  the  global 
Algorithm 5 Total variation maximization

Input: Slices of Fourier transform data, associated phase errors $\omega := \{\omega_i\}_{i \in V}$ up to a modulation and global phase factor
Output: Estimated SAR image $Y$
Let $X$ denote the matrix of Fourier transform data
Let $M$ denote the matrix of slice indices $\{0, \ldots, n-1\}$
for $k = 0$ to $n-1$ do
    Compute $\tilde\omega \leftarrow E^k \omega$
    Compute $\tilde X \leftarrow X \circ \tilde\omega(M)$ {$\circ$ is the Hadamard matrix product}
    Compute the inverse Fourier transform $Y \leftarrow F^{-1} \tilde X$
    Compute the total variation $\mathrm{TV}_k \leftarrow \|Y\|_{\mathrm{TV}}$
end for
Compute $m \leftarrow \arg\max_k \mathrm{TV}_k$
Compute $\tilde\omega \leftarrow E^m \omega$
Compute $\tilde X \leftarrow X \circ \tilde\omega(M)$
Compute the inverse Fourier transform $Y \leftarrow F^{-1} \tilde X$
Compute $\beta \leftarrow \sum_i \sum_j Y[i,j] \,/\, \big|\sum_i \sum_j Y[i,j]\big|$
Output: $Y \leftarrow \operatorname{Re}(\beta^{-1} Y)$


phase  which,  combined  with  the  estimated  modulation  in  the  Fourier  domain,  will 
yield  the  best  estimate  for  the  image  itself  (after  an  inverse  Fourier  transform).  Due 
to  the  effect  of  noise,  we  take  the  real  part  of  the  resulting  image  data  to  approximate 
the  actual  image,  since  the  imaginary  parts  of  the  pixel  data  are  expected  to  be 
relatively  small.  This  procedure  is  outlined  in  Algorithm  5. 
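A rough Python sketch of this procedure follows. It assumes NumPy, a modulation convention $(E^k\omega)_j = e^{2\pi\mathrm{i}jk/n}\omega_j$, and a hypothetical slice map `angular_slices` that assigns each Fourier pixel to one of $n$ lines through the origin; none of these choices is confirmed by the text, and the small synthetic image is purely illustrative.

```python
import numpy as np

def tv(img):
    """Total variation of |img| over vertically and horizontally adjacent pixels."""
    a = np.abs(img)
    return np.abs(np.diff(a, axis=0)).sum() + np.abs(np.diff(a, axis=1)).sum()

def angular_slices(N, n):
    """Hypothetical slice map M: index of the line through the origin for each Fourier pixel."""
    f = np.fft.fftfreq(N)
    u, v = np.meshgrid(f, f, indexing="ij")
    angle = np.mod(np.arctan2(v, u), np.pi)
    return np.minimum((angle / np.pi * n).astype(int), n - 1)

def demodulate(X, M, omega, k):
    """Apply E^k omega slice by slice (Hadamard product) and invert the Fourier transform."""
    n = len(omega)
    w = np.exp(2j * np.pi * k * np.arange(n) / n) * omega
    return np.fft.ifft2(X * w[M])

def best_modulation(X, M, omega):
    """Modulation index whose reconstruction has the largest total variation."""
    scores = [tv(demodulate(X, M, omega, k)) for k in range(len(omega))]
    return int(np.argmax(scores))

def reconstruct(X, M, omega, m):
    """Remove the estimated global phase and keep the real part."""
    Y = demodulate(X, M, omega, m)
    beta = Y.sum() / abs(Y.sum())
    return np.real(Y / beta)

N, n, k0 = 16, 5, 3
img = np.zeros((N, N))
img[4:9, 6:12] = 1.0                      # real, nonnegative test image
X = np.fft.fft2(img)                      # true phase errors taken to be all ones
M = angular_slices(N, n)
omega_est = np.exp(0.7j) * np.exp(-2j * np.pi * k0 * np.arange(n) / n)
m = best_modulation(X, M, omega_est)
recovered = reconstruct(X, M, omega_est, k0)
```

When the modulation index is the true one (`k0` here), the global phase step returns the original image up to round-off. Whether `best_modulation` actually selects `k0` depends on the image, the slice geometry, and the noise level, so that outcome is not asserted in this sketch.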

Overall, the solution to the phase error problem we have developed here involves a particular measurement design and phase error reconstruction procedure:

Measurement  design 

•  Let $G = (V,E)$ and $G' = (V',E')$ be graphs such that $G$ is circulant with $|V|$ an odd prime, $V' = E$, and $(i,j),(k,\ell) \in V'$ are adjacent in $G'$ if and only if they share a neighbor in a common $n$-cycle in the decomposition of $G$ given by Lemma 5.7

•  Design a multistatic SAR system such that applying Fact 5.2 yields the unknown phase errors $\{\omega_i\}_{i \in V}$, the unknown relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in V'}$, and the known combinations $\{\omega_i\omega_j^{-2}\omega_k\}_{((i,j),(j,k)) \in E'}$

SAR  image  reconstruction  procedure 

•  For each cycle in $G'$, use Algorithm 3 to calculate the relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in V'}$ up to distinct global phase factors from $\{\omega_i\omega_j^{-2}\omega_k\}_{((i,j),(j,k)) \in E'}$

•  Form the weighted adjacency matrices for each $n$-cycle in $G$ using the relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in V'}$ from the corresponding cycles in $G'$ according to (49)

•  Use Algorithm 4 to synchronize the cycle-dependent phase factors

•  Use Algorithm 3 to calculate the phase errors $\{\omega_i\}_{i \in V}$ up to a modulation and a global phase factor from the synchronized relative phases $\{\omega_i\omega_j^{-1}\}_{(i,j) \in V'}$

•  Use  Algorithm  5  to  estimate  the  SAR  image  by  picking  the  modulation  that 
maximizes  total  variation 

To test the phase error recovery procedure (i.e., the first four bullets of the SAR image reconstruction procedure above), we simulated the recovery of random phase errors $\omega_i$ from noisy products of the form $\omega_i\omega_j^{-2}\omega_k$ using random circulant graphs on 101 vertices. For graphs $G$ containing a fixed number of cycles, we generated a random phase error vector and took such a product of phases for each edge in the corresponding graph $G'$. We then added complex Gaussian noise to each product (normalizing the resultant entries) to simulate noisy SAR data acquired from multistatic schemes according to Fact 5.2. Finally, using Algorithms 3 and 4 as prescribed above, we generated the estimated phase error vector (up to a modulation and global phase factor). Figure 6 depicts how the performance of this phase error reconstruction changed with the number of cycles in the graph $G$. Since the global phase ambiguity is not resolved in this portion of the image reconstruction process, the relative error here is measured between the outer products of the phase error vector and the phase error estimate, which we then minimized over all possible
Figure 6: Iterative angular synchronization as a means of recovering a phase error vector (up to a modulation and global phase) from nested relative phase information. Here, simulations use a graph with 101 vertices and varying numbers of cycles. Note that the global phase ambiguity in the estimate cannot be resolved, and so the relative errors are taken between the outer products of the true and estimated phase error vectors. Since the estimate produced is also a modulation of the true phase error vector, the errors depicted are the minima over all possible modulations. At left we plot relative error in the output as a function of the number of cycles in $G$, while at right we plot computation time for the same varying set of cycles. For the first row of plots (top), the normalized noise is fixed in average magnitude at $\sigma^2 = 0.01$, while the second row (bottom) depicts noise fixed at $\sigma^2 = 0.1$. Piecewise linear graphs connecting the sample averages are shown for clarity.


modulations.  Note  that  multiple  cycles  tend  to  reduce  the  error  in  reconstruction, 
though  with  diminishing  returns.  Since  the  computation  time  grows  with  each  cycle 
added  to  G,  this  suggests  the  existence  of  an  “optimal”  number  of  cycles  that  gives 
good  reconstruction  while  keeping  computation  time  low. 
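The error measure and the noise model used in these simulations can be sketched as follows. This is an interpretation under stated assumptions: NumPy, the Frobenius (Hilbert-Schmidt) norm for the relative error, and circularly symmetric Gaussian noise with standard deviation `sigma`; the function names are hypothetical.

```python
import numpy as np

def add_normalized_noise(z, sigma, rng):
    """Add complex Gaussian noise to unit-modulus entries, then renormalize to the unit circle."""
    noise = sigma * (rng.standard_normal(z.shape) + 1j * rng.standard_normal(z.shape)) / np.sqrt(2)
    noisy = z + noise
    return noisy / np.abs(noisy)

def outer_product_error(omega_true, omega_est):
    """Relative error between outer products, minimized over all n modulations of the estimate."""
    n = len(omega_true)
    target = np.outer(omega_true, omega_true.conj())
    idx = np.arange(n)
    errors = []
    for k in range(n):
        w = np.exp(2j * np.pi * k * idx / n) * omega_est
        errors.append(np.linalg.norm(np.outer(w, w.conj()) - target) / np.linalg.norm(target))
    return min(errors)

rng = np.random.default_rng(3)
n = 11
omega = np.exp(2j * np.pi * rng.random(n))
ambiguous = np.exp(1.1j) * np.exp(2j * np.pi * 3 * np.arange(n) / n) * omega
err_exact = outer_product_error(omega, ambiguous)
err_noisy = outer_product_error(omega, add_normalized_noise(omega, 0.1, rng))
```

Using outer products removes the global phase, and the minimum over modulations removes the remaining ambiguity, so `err_exact` is at round-off level even though `ambiguous` differs from `omega` entrywise.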

To  get  an  idea  of  the  behavior  of  the  phase  error  recovery  procedure  in  the 
presence  of  noise,  we  also  simulated  the  recovery  of  random  phase  errors  from  noisy 
multistatic  SAR  data  in  different  noise  regimes  using  a  fixed  number  of  cycles.  In 
particular,  we  generated  random  circulant  graphs  containing  1,  5,  and  10  cycles, 
examining  the  performance  of  phase  error  recovery  in  each  case  under  varying  levels 
of  noise.  Figure  7  displays  the  observed  relative  error  in  output  as  a  function  of 
Figure 7: The stability of iterative angular synchronization is cycle-dependent: more cycles imply better stability (and each cycle comes from a pair of aircraft in the multistatic system). Here, iterative angular synchronization recovers the phase error vector (up to a modulation and global phase) from simulated nested relative phase information using a graph with 101 vertices and varying numbers of cycles. Since the estimate produced is necessarily a modulation of the true phase error vector, the errors depicted are taken to be the minima over all possible modulations. Each plot depicted shows relative error in recovery as a function of average input noise magnitude (normalized). From left to right, the graphs used contain 1 and 5 cycles; the case of 10 cycles behaves almost identically to the 5 cycle case. Note that when the graph contains one cycle, the algorithm is relatively ineffective at determining the phase error vector (up to any modulation) regardless of the level of noise in the model; as such, stability is poor for this case. Meanwhile, graphs with greater numbers of cycles exhibit better stability to noise.


input  noise;  as  illustrated,  the  phase  error  reconstruction  procedure  appears  to  be 
stable  to  noise  provided  the  graph  G  contains  multiple  cycles. 

Finally,  we  tested  the  last  bullet  of  the  SAR  image  reconstruction  procedure. 
To  do  this,  we  used  a  cropped  version  of  a  SAR  image  of  the  Pentagon  (available 
at  [83]).  Partitioning  the  pixels  of  its  Fourier  transform  into  101  sections,  each 
corresponding  to  a  different  slice  of  Fourier  space,  we  then  multiplied  each  section  by 
a  different  (noisy)  phase  factor  to  simulate  phase  errors.  Specifically,  we  generated  a 
vector  of  101  iid  random  phases  and  added  complex  Gaussian  noise,  normalizing  each 
resultant  entry.  Since  the  phase  error  recovery  procedure  only  guarantees  recovery  of 
the  true  phase  errors  up  to  a  modulation  and  global  phase  factor,  we  then  randomly 
modulated  this  vector  and  multiplied  by  a  random  phase;  pointwise  multiplying 
each  entry  of  this  vector  by  the  corresponding  slice  of  the  image’s  Fourier  transform 
then  simulates  the  output  of  the  first  four  bullets  of  the  SAR  image  reconstruction 
procedure.  Taking  the  actual  phase  error  vector  to  be  all  ones  (without  loss  of 


Figure 8: The Fourier domain is partitioned into 101 lines and each line is given a simulated (noisy) phase error. If the phase errors are known up to a modulation and global phase factor, then Algorithm 5 recovers the modulation by maximizing total variation of the image. (a) The (approximate) probability that the estimated modulation is not the true modulation (as a function of average noise magnitude); here, a sample of 100 simulations provides the estimated probability for each noise level in $\{0, 0.1, 0.2, \ldots, 1\}$. (b) The relative error in image reconstruction using phase errors estimated by Algorithm 5 suggests stability to noise; depicted are clusters of 30 simulations at varying noise levels.


generality),  we  then  gave  the  generated  phase  error  estimates  and  noisy  Fourier  data 
to  Algorithm  5,  the  results  of  which  are  depicted  in  Figures  8  and  9. 

As  we  can  see,  the  demodulation  portion  of  this  algorithm  is  particularly  stable 
to  noise;  indeed,  Figure  8(a)  suggests  that  the  total  variation  method  of  modulation 
recovery  has  a  success  rate  of  over  90%  for  noise  with  average  magnitude  of  up  to 
0.3.  Also,  Figure  8(b)  shows  that  relative  error  in  image  reconstruction  behaves 
well  with  the  input  noise.  Figure  9  illustrates  how  the  total  variation  of  the  image 
under  all  possible  modulations  is  affected  by  differing  levels  of  noise,  as  well  as  the 
resultant  quality  in  image  recovery.  Indeed,  the  noiseless  case  demonstrates  that 
the true modulation is easily identified by a large total variation. When the noise
level is 0.3 in average magnitude, the true modulation can still be detected by
maximizing  total  variation,  but  the  sidelobes  are  rising  and  the  noise  is  somewhat 
noticeable  in  the  resulting  image  reconstruction.  Finally,  using  noise  with  average 
magnitude  of  0.7  raises  the  sidelobes  so  high  that  the  true  modulation  can  no  longer 
be  detected.  As  Figure  8(a)  indicates,  we  can  expect  the  true  modulation  to  be 
successfully  detected  only  50%  of  the  time  with  this  level  of  noise. 
Figure 9: Reconstructing a SAR image from noisy Fourier data. The Fourier domain is partitioned into 101 lines
and  the  image  information  on  each  line  is  given  a  random  (noisy)  global  phase.  The  resultant  simulated 
phase  error  vector  is  then  modulated  with  a  random  modulation  index.  To  recover  the  modulation,  the 
image  is  reconstructed  using  all  possible  modulations  and  that  which  yields  maximal  total  variation  in  the 
reconstruction  is  taken;  the  global  phase  factor  is  then  estimated  to  reconstruct  the  image  with  this  choice 
of  modulation  (see  Algorithm  5).  At  top  right  is  a  SAR  image  of  the  Pentagon  (noiseless);  the  images 
beneath  are  reconstructions  with  average  noise  magnitudes  0.3  and  0.7,  respectively.  Each  plot  at  left 
shows  the  total  variation  of  the  corresponding  image  as  a  function  of  modulation  index.  For  the  noiseless 
case,  the  maximizing  modulation  57  is  indeed  the  true  modulation.  With  noise  at  average  magnitude  of 
0.3,  the  total  variation  is  again  maximized  at  the  true  modulation  index  (13).  In  the  case  of  average  noise 
magnitude  0.7,  the  modulation  which  maximizes  the  total  variation  is  12,  while  the  true  modulation  is  45, 
resulting in the unsuccessful recovery displayed. The relative errors in the reconstructed images at noise
levels of 0.3 and 0.7 above are 0.0410 and 0.1667, respectively.
VI. Conclusion


In  this  thesis  we  made  significant  theoretical  progress  on  the  phase  retrieval  problem. 
First,  we  analyzed  what  it  means  for  an  ensemble  of  intensity  measurements  to  be 
injective  and  stable.  In  doing  so,  we  characterized  injectivity  in  both  the  real  and 
complex cases, leading to the conjecture that $4M - 4$ intensity measurements are necessary and sufficient to successfully determine an $M$-dimensional complex signal (up to a global phase). We made certain contributions toward a proof of this conjecture, to include a deterministic construction of an injective ensemble using $4M - 4$ intensity measurements. Next, we devised a theory of almost injectivity to
characterize  ensembles  of  intensity  measurements  that  enable  signal  recovery  on  a 
dense  subset  of  CM .  This  led  to  a  discussion  of  the  computational  limits  of  phase 
retrieval, where we analyzed computational complexity by drawing connections between phase retrieval and the well-known subset sum problem. We concluded with
a  stability  analysis,  where  we  developed  stronger  versions  of  the  characterizations 
of  injectivity  in  the  presence  of  stochastic  noise  as  well  as  a  new  condition  which 
strengthens  the  complement  property  in  the  worst  case. 

The  second  major  contribution  of  this  thesis  was  to  develop  a  new  multistatic 
methodology for synthetic aperture radar to resolve phase errors. Drawing motivation from the phase retrieval problem, we related the phase error problem to interferometric phase recovery techniques. Using graphs to organize the SAR data, we then leveraged angular synchronization to reconstruct the phase errors. Simulations suggest that image reconstruction based on this approach is stable to noise,
and  desirable  results  can  be  achieved  using  few  aircraft. 

Aside  from  this  thesis,  theoretical  progress  on  the  phase  retrieval  problem  is 
currently an active area of research. For instance, the $4M - 4$ Conjecture, although
widely  believed,  still  lacks  a  complete  proof.  The  following  summarizes  what  is 
currently  known  about  the  conjecture: 
•  The conjecture holds for $M = 2$ and $M = 2^m + 1$, $m = 1, 2, 3, \ldots$ [38] (see also Section 2.1 of this thesis).

•  If $N < 4M - 2\alpha(M-1) - 3$, then $\mathcal{A}$ is not injective [62]; here, $\alpha(M-1) \leq \log_2 M$ denotes the number of 1's in the binary expansion of $M - 1$.

•  For each $M \geq 2$, there exists an ensemble $\Phi$ of $N = 4M - 4$ measurement vectors such that $\mathcal{A}$ is injective [17] (see also Section 2.3 of this thesis).

•  If $N \geq 4M - 4$, then $\mathcal{A}$ is injective for generic $\Phi$ [38] (cf. [8]).
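To make the gap between the known necessary and sufficient measurement counts concrete, the following sketch tabulates both thresholds from the list above for small $M$. It is plain Python; the function names are hypothetical, and the formulas are read directly from the bullets rather than from the cited papers.

```python
def alpha(m):
    """Number of 1's in the binary expansion of m."""
    return bin(m).count("1")

def known_thresholds(M):
    """Return (lower, upper): the map is not injective whenever N < lower,
    and some ensemble of N = upper = 4M - 4 vectors is injective."""
    return 4 * M - 2 * alpha(M - 1) - 3, 4 * M - 4

table = {M: known_thresholds(M) for M in range(2, 10)}
```

For example, `table[4]` is `(9, 12)`, so for $M = 4$ the bullets leave the cases $N = 9, 10, 11$ undecided, whereas for $M = 5 = 2^2 + 1$ the pair is `(15, 16)` and the first bullet closes the remaining case.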

Furthermore,  there  remains  a  fundamental  lack  of  understanding  of  injectivity  in 
the complex case. Although injectivity has been characterized in this case (see Sections 2.1 and 2.2 of this thesis), formulating sufficient conditions for injectivity in the
complex  case  which  can  be  verified  in  finite  time  remains  a  subject  for  future  work. 
Characterizing  almost  injectivity  for  complex  ensembles  is  also  an  open  problem, 
and  a  stability  analysis  for  phase  retrieval  in  this  setting  is  still  necessary. 

The phase error problem in synthetic aperture radar similarly requires future work. In particular, alternative ways of relaxing the feasibility problem (50) to enable the use of convex programming techniques are highly desirable. If this is not possible, the process described in Section 5.4 requires a performance guarantee, and it may also be improved upon. Indeed, the intermediate step of cycle synchronization to link the two graphs $G$ and $G'$ is not very democratic; an alternative method that simultaneously uses information from all available cycles in $G'$ might improve efficiency and stability of the entire algorithm. Meanwhile, the phase error recovery algorithm presented in this thesis is in need of a feasibility assessment to better understand the possibility of real-world implementation in SAR imaging systems. For example, what types of (realistic) aircraft flight patterns enable the encoding of multistatic SAR data using circulant graphs? Also, are there other types of graphs, which may be more easily implementable, that exhibit the same phase error recovery characteristics? In this thesis, we assumed that we knew when an aircraft's bearing vector bisects the bearing vectors of two other aircraft. How stable is this assumption to fluctuations in the aircraft position? Further issues concerning the effect of crosstalk in multistatic SAR systems need to be addressed as well, and questions regarding simplifying assumptions should be considered (e.g., two-dimensional target scenes). One should account for all of these factors, not only in discerning the feasibility of the multistatic methodology we present, but also in competing with state of the art phase error-correction algorithms, which do not require a multistatic system but might prove to be less reliable.
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Appendix  A. 

Here,  we  verify  that  we  can  differentiate  under  the  integral  sign  in  the  proof  of 
Theorem  4.10  from  Section  4.2. 


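Lemma A.1. Consider the probability density function defined by

\[ f(y;\theta) = \frac{1}{(2\pi\sigma^2)^{N/2}}\, e^{-\|y-\mathcal{A}(\theta)\|^2/2\sigma^2} \qquad \forall y \in \mathbb{R}^N. \]

Then for every function $g\colon \mathbb{R}^N \to \mathbb{R}$ with finite second moment

\[ \int_{\mathbb{R}^N} g(y)^2 f(y;\theta)\,dy < \infty \qquad \forall \theta \in \Omega, \]

we can differentiate under the integral sign:

\[ \frac{\partial}{\partial\theta_i} \int_{\mathbb{R}^N} g(y) f(y;\theta)\,dy = \int_{\mathbb{R}^N} g(y)\, \frac{\partial}{\partial\theta_i} f(y;\theta)\,dy. \]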


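Proof. First, we adapt the proof of Lemma 5.14 in [72] to show that it suffices to find a function $b(y;\theta)$ with finite second moment such that, for some $\varepsilon > 0$,

\[ \left| \frac{f(y;\theta + z\delta_i) - f(y;\theta)}{z f(y;\theta)} \right| \leq b(y;\theta) \qquad \forall y \in \mathbb{R}^N,\ \theta \in \Omega,\ |z| < \varepsilon,\ z \neq 0, \tag{51} \]

where $\delta_i$ denotes the $i$th identity basis element in $\mathbb{R}^{2M}$. Indeed, by applying the Cauchy-Schwarz inequality over $f$-weighted $L^2$ space, we have

\[ \int_{\mathbb{R}^N} |g(y)|\, b(y;\theta) f(y;\theta)\,dy \leq \left( \int_{\mathbb{R}^N} g(y)^2 f(y;\theta)\,dy \right)^{1/2} \left( \int_{\mathbb{R}^N} b(y;\theta)^2 f(y;\theta)\,dy \right)^{1/2} < \infty, \]

and so the dominated convergence theorem gives

\[
\begin{aligned}
\int_{\mathbb{R}^N} g(y)\, \frac{\partial}{\partial\theta_i} f(y;\theta)\,dy
&= \int_{\mathbb{R}^N} \lim_{z\to 0} \left( g(y)\, \frac{f(y;\theta + z\delta_i) - f(y;\theta)}{z f(y;\theta)} \right) f(y;\theta)\,dy \\
&= \lim_{z\to 0} \int_{\mathbb{R}^N} g(y)\, \frac{f(y;\theta + z\delta_i) - f(y;\theta)}{z f(y;\theta)}\, f(y;\theta)\,dy \\
&= \lim_{z\to 0} \frac{1}{z} \left( \int_{\mathbb{R}^N} g(y) f(y;\theta + z\delta_i)\,dy - \int_{\mathbb{R}^N} g(y) f(y;\theta)\,dy \right) \\
&= \frac{\partial}{\partial\theta_i} \int_{\mathbb{R}^N} g(y) f(y;\theta)\,dy.
\end{aligned}
\]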


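In pursuit of such a function $b(y;\theta)$, we first use the triangle and Cauchy-Schwarz inequalities to get

\[
\begin{aligned}
\left| \frac{f(y;\theta + z\delta_i) - f(y;\theta)}{z f(y;\theta)} \right|
&= \frac{1}{|z|} \left| e^{-\frac{1}{2\sigma^2}\left( \|y - \mathcal{A}(\theta + z\delta_i)\|^2 - \|y - \mathcal{A}(\theta)\|^2 \right)} - 1 \right| \\
&= \frac{1}{|z|} \left| e^{-\frac{1}{2\sigma^2}\left( \|\mathcal{A}(\theta + z\delta_i)\|^2 - \|\mathcal{A}(\theta)\|^2 - 2\langle y, \mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\rangle \right)} - 1 \right| \\
&\leq \frac{1}{|z|} \left| e^{\frac{1}{\sigma^2}\langle y, \mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\rangle} - 1 \right|
 + \frac{1}{|z|}\, e^{\frac{1}{\sigma^2}\langle y, \mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\rangle} \left| e^{-\frac{1}{2\sigma^2}\left( \|\mathcal{A}(\theta + z\delta_i)\|^2 - \|\mathcal{A}(\theta)\|^2 \right)} - 1 \right| \\
&\leq \frac{1}{|z|} \left( e^{\frac{1}{\sigma^2}\|y\|\,\|\mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\|} - 1 \right)
 + \frac{1}{|z|}\, e^{\frac{1}{\sigma^2}\|y\|\,\|\mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\|} \left| e^{-\frac{1}{2\sigma^2}\left( \|\mathcal{A}(\theta + z\delta_i)\|^2 - \|\mathcal{A}(\theta)\|^2 \right)} - 1 \right|.
\end{aligned}
\tag{52}
\]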


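Denote $c(z;\theta) := \frac{1}{\sigma^2}\|\mathcal{A}(\theta + z\delta_i) - \mathcal{A}(\theta)\|$. Since $(e^{st} - 1)/t \leq s e^{st}$ whenever $s, t > 0$, we then have

\[ \frac{e^{c(z;\theta)\|y\|} - 1}{|z|} = \frac{c(z;\theta)}{|z|} \cdot \frac{e^{c(z;\theta)\|y\|} - 1}{c(z;\theta)} \leq \frac{c(z;\theta)}{|z|}\, \|y\|\, e^{c(z;\theta)\|y\|}. \]

Also by l'Hospital's rule, there exist continuous functions $C_1$ and $C_2$ on the real line such that

\[ C_1(z;\theta) = \frac{c(z;\theta)}{|z|}, \qquad C_2(z;\theta) = \frac{1}{|z|} \left| e^{-\frac{1}{2\sigma^2}\left( \|\mathcal{A}(\theta + z\delta_i)\|^2 - \|\mathcal{A}(\theta)\|^2 \right)} - 1 \right| \qquad \forall z \neq 0. \]

Thus, continuing (52) gives

\[ \left| \frac{f(y;\theta + z\delta_i) - f(y;\theta)}{z f(y;\theta)} \right| \leq \big( C_1(z;\theta)\|y\| + C_2(z;\theta) \big)\, e^{c(z;\theta)\|y\|}. \]

Now for a fixed $\varepsilon$, take $C_j(\theta) := \sup_{|z|<\varepsilon} C_j(z;\theta)$ and $c(\theta) := \sup_{|z|<\varepsilon} c(z;\theta)$, and define

\[ b(y;\theta) := \big( C_1(\theta)\|y\| + C_2(\theta) \big)\, e^{c(\theta)\|y\|}. \]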
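The elementary bound $(e^{st} - 1)/t \leq s e^{st}$ invoked above is equivalent to $1 - e^{-x} \leq x$ with $x = st$. As a sanity check rather than a proof, the following Python sketch evaluates both sides on a grid of positive values; the function names and the grid are illustrative choices, not part of the thesis.

```python
import math

def difference_quotient(s: float, t: float) -> float:
    """Return (e^{st} - 1) / t for s, t > 0, using expm1 for accuracy near 0."""
    return math.expm1(s * t) / t

def upper_bound(s: float, t: float) -> float:
    """Return s * e^{st}, the quantity that bounds the difference quotient."""
    return s * math.exp(s * t)

# Geometric grid of positive test values (an arbitrary illustrative choice).
GRID = [0.01 * 3 ** k for k in range(8)]

def find_violations(grid=GRID, rel_tol=1e-12):
    """List all grid points (s, t) where the bound fails beyond rounding error."""
    return [(s, t) for s in grid for t in grid
            if difference_quotient(s, t) > upper_bound(s, t) * (1 + rel_tol)]

violations = find_violations()  # expected to be empty if the bound holds on the grid
```

In the proof, the bound is applied with $s = \|y\|$ and $t = c(z;\theta)$.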

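Since $C_j(\theta)$ and $c(\theta)$ are suprema of continuous functions over a bounded set, these are necessarily finite for all $\theta \in \Omega$. As such, this choice for $b$ satisfies (51). It remains to verify that $b$ has a finite second moment. To this end, let $B(R(\theta))$ denote the ball of radius $R(\theta)$ centered at the origin (we will specify $R(\theta)$ later). Then

\[
\begin{aligned}
\int_{\mathbb{R}^N} b(y;\theta)^2 f(y;\theta)\,dy
&= \int_{B(R(\theta))} b(y;\theta)^2 f(y;\theta)\,dy + \int_{\mathbb{R}^N \setminus B(R(\theta))} b(y;\theta)^2 f(y;\theta)\,dy \\
&\leq \big( C_1(\theta)R(\theta) + C_2(\theta) \big)^2 e^{2c(\theta)R(\theta)} \\
&\qquad + \frac{1}{(2\pi\sigma^2)^{N/2}} \int_{\mathbb{R}^N \setminus B(R(\theta))} \big( C_1(\theta)\|y\| + C_2(\theta) \big)^2 e^{2c(\theta)\|y\| - \frac{1}{2\sigma^2}\|y - \mathcal{A}(\theta)\|^2}\,dy.
\end{aligned}
\tag{53}
\]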


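From here, we note that whenever $\|y\| \geq 2\|\mathcal{A}(\theta)\| + 8\sigma^2 c(\theta)$, we have

\[
\begin{aligned}
\|y - \mathcal{A}(\theta)\|^2
&\geq \|y\|^2 - 2\|y\|\,\|\mathcal{A}(\theta)\| + \|\mathcal{A}(\theta)\|^2 \\
&\geq \big( 2\|\mathcal{A}(\theta)\| + 8\sigma^2 c(\theta) \big)\|y\| - 2\|y\|\,\|\mathcal{A}(\theta)\| + \|\mathcal{A}(\theta)\|^2 \\
&\geq 8\sigma^2 c(\theta)\|y\|.
\end{aligned}
\]

Rearranging then gives $2c(\theta)\|y\| \leq \frac{1}{4\sigma^2}\|y - \mathcal{A}(\theta)\|^2$. Also let $h(\theta)$ denote the larger root of the polynomial

\[ p(x;\theta) := 2C_1(\theta)^2 \big( x^2 - 2\|\mathcal{A}(\theta)\|x + \|\mathcal{A}(\theta)\|^2 \big) - \big( C_1(\theta)x + C_2(\theta) \big)^2, \]

and take $h(\theta) := 0$ when the roots of $p(x;\theta)$ are not real. (Here, we are assuming that $C_1 > 0$, but the proof that (53) is finite when $C_1 = 0$ quickly follows from the $C_1 > 0$ case.) Then $(C_1(\theta)\|y\| + C_2(\theta))^2 \leq 2C_1(\theta)^2\|y - \mathcal{A}(\theta)\|^2$ whenever $\|y\| \geq h(\theta)$, since by the Cauchy-Schwarz inequality,

\[ 2C_1(\theta)^2\|y - \mathcal{A}(\theta)\|^2 - \big( C_1(\theta)\|y\| + C_2(\theta) \big)^2 \geq p(\|y\|;\theta) \geq 0, \]

where the last step follows from the fact that $p(x;\theta)$ is concave up. Now we continue by taking $R(\theta) := \max\{2\|\mathcal{A}(\theta)\| + 8\sigma^2 c(\theta),\, h(\theta)\}$:

\[
\begin{aligned}
\int_{\mathbb{R}^N \setminus B(R(\theta))} \big( C_1(\theta)\|y\| + C_2(\theta) \big)^2 e^{2c(\theta)\|y\| - \frac{1}{2\sigma^2}\|y - \mathcal{A}(\theta)\|^2}\,dy
&\leq \int_{\mathbb{R}^N \setminus B(R(\theta))} 2C_1(\theta)^2 \|y - \mathcal{A}(\theta)\|^2 e^{-\frac{1}{4\sigma^2}\|y - \mathcal{A}(\theta)\|^2}\,dy \\
&\leq \big( 2\pi(\sqrt{2}\sigma)^2 \big)^{N/2} \cdot 2C_1(\theta)^2 \int_{\mathbb{R}^N} \|x\|^2\, \frac{1}{\big( 2\pi(\sqrt{2}\sigma)^2 \big)^{N/2}}\, e^{-\|x\|^2 / 2(\sqrt{2}\sigma)^2}\,dx,
\end{aligned}
\]

where the last step comes from integrating over all of $\mathbb{R}^N$ and changing variables $y - \mathcal{A}(\theta) \mapsto x$. This last integral calculates the expected squared length of a vector in $\mathbb{R}^N$ with independent $\mathcal{N}(0, 2\sigma^2)$ entries, which is $2N\sigma^2$. Thus, substituting into (53) gives that $b$ has a finite second moment. □

Appendix B.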

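Here, we compute the phase error vector $\omega := \{\omega_j\}_{j \in V} \subseteq \mathbb{T}$ from a solution to a relaxed version of the feasibility problem (50): for some fixed tolerance $\varepsilon > 0$,

\[ \text{Find } X \in \mathbb{H}^{n \times n} \text{ such that } \operatorname{rank}(X) = 1 \text{ and } \|\mathcal{P}_{S^\perp} X\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} \big| 1 - \langle X, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \big|^2 \leq \varepsilon. \]

Note that the outer product of any modulation of the optimal $\omega$ is necessarily feasible, and so relaxing the problem to make it convex implies that any convex combination of such modulations is feasible for the relaxed version:

Theorem B.1. Suppose $\omega\colon \mathbb{Z}_n \to \mathbb{T}$ is such that $\omega\omega^*$ solves a convex relaxation of the feasibility problem (50). Then $\sum_{i=0}^{n-1} \lambda_i E^i \omega\omega^* E^{-i}$ is also a solution for every $\{\lambda_i\}_{i=0}^{n-1} \subseteq \mathbb{R}$ such that $\sum_{i=0}^{n-1} \lambda_i = 1$.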

To be clear, the modulation operator $E$ in Theorem B.1 is the $n \times n$ diagonal matrix whose $k$th diagonal entry is $e^{2\pi i k/n}$. Before proving the theorem, however, we first consider some intermediate results.

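Lemma B.2. Let $G = (V, E)$ be a circulant graph with $n$ vertices such that $n$ is an odd prime and $\omega\colon V \to \mathbb{T}$. Then $\|\mathcal{P}_{S^\perp}(E\omega\omega^*E^*)\|_{\mathrm{HS}} = \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}$.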

Proof. By Lemma 5.7, $G$ decomposes into copies of the $n$-cycle. For each resultant $n$-cycle $C$, define the matrices $X_C$ and $X_{-C}$ as in (49) and recall the subspace $S = \operatorname{span}\{X_C, X_{-C}\}_{C \subseteq G} + T$, where $T := \{X \in \mathbb{H}^{n \times n} : X[i,j] = 0 \text{ whenever } A[i,j] = 1\}$. Since no two $n$-cycles in the decomposition of $G$ share an edge and each nonzero entry of $X_C$ and $X_{-C}$ is of unit modulus, it follows that $\{\frac{1}{\sqrt{n}}X_C, \frac{1}{\sqrt{n}}X_{-C}\}_{C \subseteq G}$ is an orthonormal set. Noting that for $i, j \in V$ we have

\[ S = \operatorname{span}\big( \{X_C, X_{-C}\}_{C \subseteq G} \cup \{\delta_i\delta_j^*\}_{A[i,j] = 0} \big) \]

then implies that the set $\{\frac{1}{\sqrt{n}}X_C, \frac{1}{\sqrt{n}}X_{-C}\}_{C \subseteq G} \cup \{\delta_i\delta_j^*\}_{A[i,j] = 0}$ forms an orthonormal basis for $S$.

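At this point, note that the Pythagorean Theorem gives

\[
\begin{aligned}
\|\omega\omega^*\|_{\mathrm{HS}}^2 &= \|\mathcal{P}_S(\omega\omega^*)\|_{\mathrm{HS}}^2 + \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}^2, \\
\|E\omega\omega^*E^*\|_{\mathrm{HS}}^2 &= \|\mathcal{P}_S(E\omega\omega^*E^*)\|_{\mathrm{HS}}^2 + \|\mathcal{P}_{S^\perp}(E\omega\omega^*E^*)\|_{\mathrm{HS}}^2,
\end{aligned}
\]

and so to establish the result it suffices to show the equalities $\|E\omega\omega^*E^*\|_{\mathrm{HS}} = \|\omega\omega^*\|_{\mathrm{HS}}$ and $\|\mathcal{P}_S(E\omega\omega^*E^*)\|_{\mathrm{HS}} = \|\mathcal{P}_S(\omega\omega^*)\|_{\mathrm{HS}}$. To this end, we first obtain

\[
\begin{aligned}
\|E\omega\omega^*E^*\|_{\mathrm{HS}}^2 &= \operatorname{Tr}[(E\omega\omega^*E^*)^* E\omega\omega^*E^*] = \operatorname{Tr}[E\omega\omega^*E^*E\omega\omega^*E^*] \\
&= \operatorname{Tr}[E^*E\omega\omega^*E^*E\omega\omega^*] = \operatorname{Tr}[(\omega\omega^*)^*\omega\omega^*] = \|\omega\omega^*\|_{\mathrm{HS}}^2,
\end{aligned}
\]

at which point taking square roots gives the former equality. For the latter, we start by again applying the Pythagorean Theorem using the basis for $S$ given above:

\[
\begin{aligned}
\|\mathcal{P}_S(E\omega\omega^*E^*)\|_{\mathrm{HS}}^2 &= \sum_{C \subseteq G} \Big( \big| \langle E\omega\omega^*E^*, \tfrac{1}{\sqrt{n}}X_C \rangle_{\mathrm{HS}} \big|^2 + \big| \langle E\omega\omega^*E^*, \tfrac{1}{\sqrt{n}}X_{-C} \rangle_{\mathrm{HS}} \big|^2 \Big) \\
&\qquad + \sum_{A[i,j] = 0} \big| \langle E\omega\omega^*E^*, \delta_i\delta_j^* \rangle_{\mathrm{HS}} \big|^2.
\end{aligned}
\tag{54}
\]

To simplify this expression, notice that

\[ \langle E\omega\omega^*E^*, X_C \rangle_{\mathrm{HS}} = \operatorname{Tr}[X_C^* E\omega\omega^*E^*] = \operatorname{Tr}[E^* X_C^* E\omega\omega^*] = \langle \omega\omega^*, E^* X_C E \rangle_{\mathrm{HS}}, \]

where $(E^* X_C E)[i,j] = e^{2\pi i (j-i)/n} X_C[i,j]$. A similar computation for $X_{-C}$ yields $(E^* X_{-C} E)[i,j] = e^{2\pi i (j-i)/n} X_{-C}[i,j]$, and so it follows that

\[ \langle E\omega\omega^*E^*, X_{\pm C} \rangle_{\mathrm{HS}} = \langle \omega\omega^*, E^* X_{\pm C} E \rangle_{\mathrm{HS}} = e^{\pm 2\pi i (j-i)/n} \langle \omega\omega^*, X_{\pm C} \rangle_{\mathrm{HS}}. \tag{55} \]

On the other hand, since $E^*\delta_i = e^{-2\pi i\, i/n}\delta_i$, we have

\[
\begin{aligned}
\langle E\omega\omega^*E^*, \delta_i\delta_j^* \rangle_{\mathrm{HS}} &= \operatorname{Tr}[(\delta_i\delta_j^*)^* E\omega\omega^*E^*] = \operatorname{Tr}[(E^*\delta_i\delta_j^*E)^* \omega\omega^*] \\
&= e^{2\pi i (i-j)/n} \operatorname{Tr}[(\delta_i\delta_j^*)^* \omega\omega^*] = e^{2\pi i (i-j)/n} \langle \omega\omega^*, \delta_i\delta_j^* \rangle_{\mathrm{HS}}
\end{aligned}
\tag{56}
\]

for any $i, j \in \{0, \ldots, n-1\}$. Substituting (55) and (56) into (54) then yields

\[
\begin{aligned}
\|\mathcal{P}_S(E\omega\omega^*E^*)\|_{\mathrm{HS}}^2 &= \sum_{C \subseteq G} \Big( \big| \langle \omega\omega^*, \tfrac{1}{\sqrt{n}}X_C \rangle_{\mathrm{HS}} \big|^2 + \big| \langle \omega\omega^*, \tfrac{1}{\sqrt{n}}X_{-C} \rangle_{\mathrm{HS}} \big|^2 \Big) + \sum_{A[i,j] = 0} \big| \langle \omega\omega^*, \delta_i\delta_j^* \rangle_{\mathrm{HS}} \big|^2 \\
&= \|\mathcal{P}_S(\omega\omega^*)\|_{\mathrm{HS}}^2,
\end{aligned}
\]

which is the desired result. □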
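The first equality in the proof of Lemma B.2, $\|E\omega\omega^*E^*\|_{\mathrm{HS}} = \|\omega\omega^*\|_{\mathrm{HS}}$, holds because conjugation by $E$ only multiplies the $(i,j)$ entry by the unit phase $e^{2\pi i (i-j)/n}$. The following Python sketch checks this numerically for one unimodular test vector; the helper names and the test vector are assumptions made for illustration only.

```python
import cmath
import math

def modulation_diagonal(n):
    """Diagonal of E: the k-th entry is e^{2*pi*i*k/n}."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

def conjugate_by_modulation(x):
    """Return E X E^* for a square matrix X given as a list of rows."""
    n = len(x)
    e = modulation_diagonal(n)
    return [[e[i] * x[i][j] * e[j].conjugate() for j in range(n)] for i in range(n)]

def hs_norm(x):
    """Hilbert-Schmidt (Frobenius) norm of a matrix given as a list of rows."""
    return math.sqrt(sum(abs(v) ** 2 for row in x for v in row))

def outer(w):
    """Outer product w w^*."""
    return [[a * b.conjugate() for b in w] for a in w]

n = 7
# Arbitrary unimodular test vector (not taken from the thesis).
omega = [cmath.exp(2j * math.pi * (k * k % n) / n) for k in range(n)]
X = outer(omega)
Y = conjugate_by_modulation(X)   # E omega omega^* E^*
```

Since every entry of $\omega\omega^*$ has unit modulus, both norms equal $n$ here.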

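Lemma B.3. Let $G = (V, E)$ be a circulant graph with $n$ vertices such that $n$ is an odd prime and $\omega\colon V \to \mathbb{T}$. For any fixed $\varepsilon > 0$, suppose that

\[ \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} \big| 1 - \langle \omega\omega^*, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \big|^2 \leq \varepsilon. \]

Then

\[ \Big\| \mathcal{P}_{S^\perp}\Big( \sum_{j=0}^{n-1} \lambda_j E^j\omega\omega^*E^{-j} \Big) \Big\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} \Big| 1 - \Big\langle \sum_{j=0}^{n-1} \lambda_j E^j\omega\omega^*E^{-j}, \delta_i\delta_i^* \Big\rangle_{\mathrm{HS}} \Big|^2 \leq \varepsilon \]

for every $\{\lambda_i\}_{i=0}^{n-1} \subseteq \mathbb{R}$ such that $\sum_{i=0}^{n-1} \lambda_i = 1$.

Proof. Suppose $\{\lambda_i\}_{i=0}^{n-1} \subseteq \mathbb{R}$ such that $\sum_{i=0}^{n-1} \lambda_i = 1$ and let $Y := \sum_{i=0}^{n-1} \lambda_i E^i\omega\omega^*E^{-i}$. By the triangle inequality, together with Lemma B.2 applied $i$ times to the $i$th term, we have

\[ \|\mathcal{P}_{S^\perp}Y\|_{\mathrm{HS}} \leq \sum_{i=0}^{n-1} \lambda_i \|\mathcal{P}_{S^\perp}(E^i\omega\omega^*E^{-i})\|_{\mathrm{HS}} = \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}} \sum_{i=0}^{n-1} \lambda_i = \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}, \]

and so $\|\mathcal{P}_{S^\perp}Y\|_{\mathrm{HS}} \leq \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}$. Moreover, for any $i \in \{0, \ldots, n-1\}$ we have

\[
\begin{aligned}
\langle Y, \delta_i\delta_i^* \rangle_{\mathrm{HS}} &= \sum_{j=0}^{n-1} \lambda_j \langle E^j\omega\omega^*E^{-j}, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \\
&= \sum_{j=0}^{n-1} \lambda_j \langle \omega\omega^*, E^{-j}\delta_i\delta_i^*E^{j} \rangle_{\mathrm{HS}} \\
&= \langle \omega\omega^*, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \sum_{j=0}^{n-1} \lambda_j = \langle \omega\omega^*, \delta_i\delta_i^* \rangle_{\mathrm{HS}},
\end{aligned}
\]

and so it follows that

\[ \|\mathcal{P}_{S^\perp}Y\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} \big| 1 - \langle Y, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \big|^2 \leq \|\mathcal{P}_{S^\perp}(\omega\omega^*)\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} \big| 1 - \langle \omega\omega^*, \delta_i\delta_i^* \rangle_{\mathrm{HS}} \big|^2 \leq \varepsilon, \]

completing the proof. □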
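The diagonal part of the argument in Lemma B.3 says that modulation leaves each diagonal entry $\langle \cdot, \delta_i\delta_i^* \rangle_{\mathrm{HS}}$ unchanged, so a combination with weights summing to one has the same diagonal as $\omega\omega^*$. A small Python sketch of this observation follows, using $E^j\omega\omega^*E^{-j} = (E^j\omega)(E^j\omega)^*$; the function names, the test vector, and the weights are illustrative assumptions.

```python
import cmath
import math

def modulate(w, j):
    """Return E^j w, where E is diagonal with k-th entry e^{2*pi*i*k/n}."""
    n = len(w)
    return [cmath.exp(2j * math.pi * j * k / n) * w[k] for k in range(n)]

def combination_diagonal(w, weights):
    """Diagonal of Y = sum_j weights[j] * (E^j w)(E^j w)^*."""
    n = len(w)
    diag = [0.0] * n
    for j, lam in enumerate(weights):
        v = modulate(w, j)
        for i in range(n):
            diag[i] += lam * abs(v[i]) ** 2
    return diag

def diagonal_penalty(diag):
    """The sum over i of |1 - Y[i][i]|^2 appearing in the relaxed constraint."""
    return sum(abs(1 - d) ** 2 for d in diag)

n = 7
omega = [cmath.exp(2j * math.pi * (3 * k * k % n) / n) for k in range(n)]  # arbitrary unimodular vector
weights = [0.5, 0.25, 0.125, 0.125, 0.0, 0.0, 0.0]                          # sums to 1
diag = combination_diagonal(omega, weights)
```

For a unimodular $\omega$ every diagonal entry equals one, so the diagonal penalty vanishes up to rounding.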

Proof of Theorem B.1. Suppose $\{\lambda_i\}_{i=0}^{n-1} \subseteq \mathbb{R}$ such that $\sum_{i=0}^{n-1} \lambda_i = 1$ and let $Y := \sum_{i=0}^{n-1} \lambda_i E^i\omega\omega^*E^{-i}$. Since $Y$ is a convex combination of solutions to the feasibility problem (50), it is feasible for any convex relaxation provided the norm and diagonal constraints are satisfied. To this end, note that Lemma B.3 implies that $\|\mathcal{P}_{S^\perp}Y\|_{\mathrm{HS}}^2 + \sum_{i=0}^{n-1} |1 - \langle Y, \delta_i\delta_i^* \rangle_{\mathrm{HS}}|^2 \leq \varepsilon$, and so $Y$ is feasible. □

To reiterate, it is not clear how to obtain a solution to the feasibility problem (50); indeed, properly relaxing it to a convex feasibility problem is still an open problem. Assuming such a relaxation exists, recovering the desired phase vector from a solution to the relaxed version is possible. To see this, let $\omega$ be a given vector of unknown phase errors and suppose the matrix $Y$ solves a convex relaxation of (50) (without noise) for some $\varepsilon > 0$. Then, as Theorem B.1 indicates, we know that $Y = \sum_{i=0}^{n-1} \lambda_i E^i\omega\omega^*E^{-i}$ for some $\{\lambda_i\}_{i=0}^{n-1} \subseteq \mathbb{R}$ such that $\sum_{i=0}^{n-1} \lambda_i = 1$. In order to recover the actual phase vector $\omega$, we therefore need some way of decomposing the matrix $Y$. Notice that if the vectors $\{E^i\omega\}_{i=0}^{n-1}$ are mutually orthogonal, then $Y$ is diagonalizable by the spectral theorem. Such a diagonalization would have the effect of arranging normalized versions of $\{E^i\omega\}_{i=0}^{n-1}$ as the columns of some matrix, which we could then analyze. For this approach, we require the following lemma:

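Lemma B.4. Let $\omega\colon \mathbb{Z}_n \to \mathbb{T}$ be a vector of unit-modulus phases. Then the set of modulations $\{\frac{1}{\sqrt{n}}E^i\omega\}_{i=0}^{n-1}$ is orthonormal.

Proof. For any $i, j \in \{0, \ldots, n-1\}$, consider the inner product

\[ \langle E^i\omega, E^j\omega \rangle_{\mathrm{HS}} = \langle \omega, E^{j-i}\omega \rangle_{\mathrm{HS}} = \operatorname{Tr}[(E^{j-i}\omega)^*\omega] = \omega^*E^{i-j}\omega = \sum_{k=0}^{n-1} e^{2\pi i (i-j)k/n} |\omega_k|^2. \]

Since $|\omega_k| = 1$ for all $k \in \{0, \ldots, n-1\}$, the geometric sum formula then yields

\[ \langle E^i\omega, E^j\omega \rangle_{\mathrm{HS}} = \begin{cases} n & \text{if } i = j \\[4pt] \dfrac{e^{2\pi i (i-j)} - 1}{e^{2\pi i (i-j)/n} - 1} & \text{if } i \neq j, \end{cases} \]

from which it follows that

\[ \big\langle \tfrac{1}{\sqrt{n}}E^i\omega, \tfrac{1}{\sqrt{n}}E^j\omega \big\rangle_{\mathrm{HS}} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \]

□

For the matrix $Y$ above, the result of Lemma B.4 implies the spectral decomposition

\[ Y = \sum_{i=0}^{n-1} \lambda_i (E^i\omega)(E^i\omega)^* = V(n\Lambda)V^*, \]

where $V$ is the matrix whose $i$th column is the vector $\frac{1}{\sqrt{n}}E^i\omega$ for each $i = 0, \ldots, n-1$ and $\Lambda$ is the diagonal matrix whose entries are $\Lambda[i,i] = \lambda_i$. Note that we may arbitrarily permute the columns of $V$ without changing the matrix $Y$. Similarly, multiplying each column by an arbitrary phase in $\mathbb{T}$ has no effect on $Y$, and so the most general expression for a solution to (50) is of the form $Y = (VP\Psi)(n\Lambda)(VP\Psi)^*$ for some permutation matrix $P$ and diagonal matrix of phases $\Psi$.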
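Lemma B.4 is easy to confirm numerically. The Python sketch below builds the normalized modulations of a unimodular vector and computes their Gram matrix, which should be the identity up to rounding; the test vector and function names are illustrative assumptions rather than anything prescribed by the thesis.

```python
import cmath
import math

def normalized_modulations(w):
    """Return the list of vectors E^i w / sqrt(n) for i = 0, ..., n-1."""
    n = len(w)
    return [[cmath.exp(2j * math.pi * i * k / n) * w[k] / math.sqrt(n) for k in range(n)]
            for i in range(n)]

def gram_matrix(vectors):
    """Gram matrix with entries <v_i, v_j> = v_j^* v_i."""
    return [[sum(a * b.conjugate() for a, b in zip(vi, vj)) for vj in vectors]
            for vi in vectors]

n = 6
# Unimodular test vector with arbitrary phases (illustrative only).
omega = [cmath.exp(2j * math.pi * t) for t in (0.13, 0.58, 0.91, 0.27, 0.44, 0.76)]
G = gram_matrix(normalized_modulations(omega))
```

Note that the check does not require $n$ to be prime; only $|\omega_k| = 1$ is used, as in the proof.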

Now consider the diagonal matrix $W$ defined by $W[i,i] = \omega_i$ for each $i = 0, \ldots, n-1$. If $w$ denotes the $n$-vector whose coordinates are all 1, then we may decompose $V$ in terms of $W$ and the $n$-dimensional inverse DFT matrix $F$:

\[ V = \frac{1}{\sqrt{n}}\, W \begin{bmatrix} E^0 w & E^1 w & \cdots & E^{n-1} w \end{bmatrix} = WF. \]

Hence, columns of the matrix $U := WFP\Psi$ are each of the form $\{\frac{1}{\sqrt{n}} e^{2\pi i z} e^{2\pi i jk/n} \omega_k\}_{k=0}^{n-1}$ for some $z \in \mathbb{R}/\mathbb{Z}$ and modulation index $j$, i.e., the columns of $U$ are modulated versions of $\omega$, multiplied by distinct global phases. Since modulation and global phase are the two degrees of freedom associated with recovering the vector $\omega$ (see Lemma 5.8), it follows that any column of $U$ (properly scaled) yields an acceptable estimate for $\omega$.
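The factorization $V = WF$ can be checked directly, and it also explains why an entrywise (Hadamard) product with the conjugate DFT matrix, the operation used in Algorithm 6, strips the modulations from the columns. The Python sketch below assumes the permutation has already been undone and uses the scaling $n$ so that the result is exactly the outer product of $\omega$ with the column phases; both choices, along with all names, are assumptions made for illustration.

```python
import cmath
import math

def inverse_dft_matrix(n):
    """Unitary inverse DFT matrix F with F[k][i] = exp(2*pi*1j*i*k/n) / sqrt(n)."""
    return [[cmath.exp(2j * math.pi * i * k / n) / math.sqrt(n) for i in range(n)]
            for k in range(n)]

def modulation_matrix(omega):
    """V = W F with W = diag(omega); column i equals E^i omega / sqrt(n)."""
    n = len(omega)
    f = inverse_dft_matrix(n)
    return [[omega[k] * f[k][i] for i in range(n)] for k in range(n)]

def strip_modulations(u):
    """Entrywise product n * (U o conj(F)); cancels exp(2*pi*1j*i*k/n) in column i."""
    n = len(u)
    f = inverse_dft_matrix(n)
    return [[n * u[k][i] * f[k][i].conjugate() for i in range(n)] for k in range(n)]

n = 5
omega = [cmath.exp(2j * math.pi * t) for t in (0.11, 0.52, 0.37, 0.90, 0.68)]
psi = [cmath.exp(2j * math.pi * t) for t in (0.05, 0.61, 0.24, 0.83, 0.40)]
V = modulation_matrix(omega)
U = [[V[k][i] * psi[i] for i in range(n)] for k in range(n)]  # U = V Psi, permutation omitted
A = strip_modulations(U)                                       # entries omega[k] * psi[i]
```

Because the resulting matrix has rank one, every column is a global-phase multiple of $\omega$, which is why a rank-one approximation can average the columns in the noisy case.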

In the presence of noise, we desire a more stable solution than simply pulling any column from $U$. Indeed, since any column of $U$ is acceptable, it makes sense that each column would be equally perturbed by noise. To leverage this, we may attempt to "integrate" over all columns in order to average out the effect of noise. This is a common approach to stable algorithms and is motivated by the Central Limit Theorem. One such way of doing this is summarized in Algorithm 6.

Algorithm 6

Input: Matrix solution $Y = U(n\Lambda)U^*$ to a relaxed version of problem (50)
Output: Vector $\omega$ of phases such that $Y = \sum_{i=0}^{n-1} \lambda_i E^i\omega\omega^*E^{-i}$ for some convex sum $\sum_{i=0}^{n-1} \lambda_i = 1$

Fix an initial estimate $\hat\omega \leftarrow \sqrt{n}\,u_0$, where $u_0$ is the first column of $U$
Fix a threshold $\varepsilon > 0$
Initialize an $n \times n$ matrix $P$ of zeros
for $i, j = 0$ to $n - 1$ do
    if $\big| 1 - \frac{1}{\sqrt{n}} |\langle E^i\hat\omega, u_j \rangle| \big| < \varepsilon$ then
        $P[i,j] \leftarrow 1$   {Detect the permutation matrix which orders the columns of $U$ as consecutive modulations of $\hat\omega$}
    end if
end for
Compute $A \leftarrow \sqrt{n}\,(UP^{-1}) \circ F^*$   {$F^*$ denotes the $n \times n$ DFT matrix; $\circ$ is the Hadamard matrix product}
Compute the singular value decomposition $Q\Sigma R^* \leftarrow \operatorname{SVD}[A]$
Compute the updated estimate $\hat\omega \leftarrow q_0$, where $q_0$ is the first column of $Q$
Output: $\omega = \{\hat\omega_k / |\hat\omega_k|\}_{k=0}^{n-1}$


This algorithm is quite natural considering the above discussion in which we decomposed the matrix $Y$. Indeed, since $U = VP\Psi$ for some permutation matrix $P$ and diagonal matrix $\Psi$ of phases, where $V = WF$ is in terms of the $n \times n$ inverse DFT, the matrix $A$ in Algorithm 6 approximates the outer product $\omega\psi^{\mathsf{T}}$, where $\psi$ is the diagonal of $\Psi$. Thus, the leading column of $Q$ in the singular value decomposition is the desired vector of phase errors from the best low rank approximation for $A$.

Appendix C.

In the spirit of reproducible research [45], we provide here the code used to run simulations in Chapter V. All data was acquired using versions of the code below.


%% Iterative Angular Synchronization
% Determines a phase error vector (up to global phase and modulation)
% from nested relative phase information. The first level of
% relative phase is encoded as edge measurements in a graph G; the
% second level is encoded as the edge measurements in a related graph
% G^prime. The vertex measurements in G^prime are determined, which
% then become the edge measurements in G. The resultant vertex
% measurements in G are the entries of the phase error vector.
% Based on angular synchronization, Amit Singer (2011).

n = 101;        % Specify number of vertices in G (must be prime >2;
                % this ensures G decomposes into n-cycles)
c = 10;         % Specify number of n-cycles in the decomposition of
                % G (must be <n/2 so that G can be directed)
noise = 0.1;    % Specify magnitude of adversarial noise (must be in
                % [0,1]); 0 is no noise, 1 is full N(0,1) noise

firstcolumndone = 0;
while firstcolumndone == 0          % Generates a random vector r of 0s
    r = zeros(1,n);                 % and 1s which will generate the
    indices = randperm(n-1);        % adjacency matrix of G; r will
    for k = 2 : n                   % have c nonzero entries, each
        if indices(k-1) <= c        % corresponding to a distinct cycle
            r(k) = 1;               % in G
        end
    end
    revr = zeros(n,1);
    revr(1) = r(1);
    for k = 2 : n
        revr(k) = r(n-k+2);
    end
    if r*revr == 0                  % Ensures r will generate a directed
        firstcolumndone = 1;        % graph G
    end
end

I = eye(n);
P = zeros(n);
A = zeros(n);

for k = 1 : n                           % Builds the permutation matrix
    if k < n                            % used to generate the
        P(:,k) = P(:,k) + I(:,k+1);     % adjacency matrix of G
    else
        P(:,k) = P(:,k) + I(:,1);
    end
end

for k = 1 : n                           % Generates the adjacency matrix
    A(:,k) = A(:,k) + P^(k-1)*r';       % A of G; A is circulant by
end                                     % construction

Ac = zeros(n,n,c);

count = 0;
rc = zeros(1,n);
for k = 2 : n               % Generates the adjacency matrices of each
    if r(k) == 1            % cycle in G; the kth cycle adjacency
        count = count + 1;  % matrix is stored as Ac(:,:,k)
        rc(k) = 1;
        for j = 1 : n
            Ac(:,j,count) = Ac(:,j,count) + P^(j-1)*rc';
        end
        rc = 0*rc;
    end
end

r1 = rand(n,1);
omega1 = exp(2*pi*1i*r1);   % Generates a random phase error vector;
                            % this is the true vertex data for G
exact = omega1*omega1';
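The graph construction above can be summarized as choosing c distinct nonzero steps modulo n, no two of which are negatives of each other, and placing an edge wherever the row and column indices differ by one of those steps. The following Python sketch restates that idea with 0-based indices; it is an illustrative port with assumed function names, not the code used for the simulations.

```python
import random

def random_generator(n, c, rng):
    """Pick c distinct steps in 1..n-1 such that no step k is chosen together with n - k."""
    while True:
        steps = set(rng.sample(range(1, n), c))
        if all((n - k) not in steps for k in steps):
            return sorted(steps)

def circulant_adjacency(n, steps):
    """Return A with A[i][j] = 1 whenever i - j is congruent to a chosen step mod n."""
    a = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in steps:
            a[(j + k) % n][j] = 1
    return a

rng = random.Random(0)            # fixed seed so the example is repeatable
n, c = 11, 4                      # n an odd prime, c < n/2
steps = random_generator(n, c, rng)
A = circulant_adjacency(n, steps)
```

The rejection loop plays the role of the `r*revr == 0` test: it guarantees that no pair of vertices is joined in both directions.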

Xc = zeros(n,n,c);

for k = 1 : c                           % Generates the true weighted
    Xc(:,:,k) = Ac(:,:,k).*exact;       % adjacency matrices for each
end                                     % cycle in G

sigma = zeros(n,c);

for k = 1 : c                           % These loops pull the nonzero
    for j = 1 : n                       % entries from each Xc and
        for m = 1 : n                   % organize them in vector
            if Xc(j,m,k) ~= 0           % form; for the kth cycle,
                sigma(j,k) = Xc(j,m,k); % sigma(:,k) is the nonzero
            end                         % data from Xc(:,:,k)
        end
    end
end

Sigma = zeros(n,c);

for k = 1 : c               % Reorders the nonzero data from each
    shift = 0;              % Xc so that consecutive entries
    x = 0;                  % come from consecutive edges in
    while shift == 0        % each cycle of G^prime; the kth
        x = x + 1;          % column of Sigma represents the
        if Ac(1,x,k) == 1   % true edge measurements for the kth
            shift = x - 1;  % cycle; this data is the received
        end                 % relative phases uncorrupted by
    end                     % noise
    for j = 1 : n
        Sigma(j,k) = sigma(1+mod((j-1)*shift,n),k);
    end
end

Sigma1 = zeros(n,c);

for k = 1 : c
    Sigma1(:,k) = Sigma(:,k) + noise*(randn(n,1)+1i*randn(n,1));
    Sigma1(:,k) = Sigma1(:,k)./abs(Sigma1(:,k));
end             % Adds N(0,1) noise to the input signal according to
                % the predefined parameter noise

data = zeros(n,n,c);

for k = 1 : c       % Creates the weighted adjacency matrices for each
    for j = 1 : n   % cycle in G^prime; nested relative phases for the
        if j < n    % kth cycle are stored in data(:,:,k)
            data(j,j+1,k) = Sigma1(j,k)*conj(Sigma1(j+1,k));
        else
            data(j,1,k) = Sigma1(j,k)*conj(Sigma1(1,k));
        end
    end
end

%
% From this point forward the algorithm is reconstructing the phase error
% vector from noise-corrupted, nested relative phase information; in
% practice, this is where real data will enter the algorithm
%

sigma1 = zeros(n,c);

for k = 1 : c                                   % Angular synchronization; the
    [V,~] = eig(data(:,:,k)+data(:,:,k)');      % columns of sigma1 are
    sigma1(:,k) = V(:,n)./abs(V(:,n));          % estimates of corresponding
end                                             % columns of Sigma (up to
                                                % distinct global phases)

% To check accuracy, note that
%     (sigma1(:,k)*sigma1(:,k)')-(Sigma(:,k)*Sigma(:,k)')
% should be a matrix of zeros for each k (without noise)

sigma2 = zeros(n,c);

for k = 1 : c               % Reorders the input data such that
    shift = 0;              % consecutive entries correspond to
    x = 0;                  % consecutive rows in the weighted
    while shift == 0        % adjacency matrices of each cycle
        x = x + 1;          % G^prime; note this is the inverse
        if Ac(1,x,k) == 1   % process to that performed above
            shift = x - 1;
        end
    end
    for j = 1 : n
        sigma2(1+mod((j-1)*shift,n),k) = sigma1(j,k);
    end
end

estimate = zeros(n,n,c);

for k = 1 : c
    estimate(:,:,k) = diag(sigma2(:,k))*Ac(:,:,k);
end         % Generates weighted adjacency matrices for each cycle
            % in G; the kth cycle is stored as estimate(:,:,k)
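The angular synchronization step above takes the leading eigenvector of the Hermitian matrix built from relative phases along a cycle and normalizes its entries. The Python sketch below illustrates the same idea in the noiseless case, but it swaps in plain power iteration on the shifted matrix $H + 2I$ in place of a full eigendecomposition so that only the standard library is needed; that substitution, the starting vector, and all names are assumptions for illustration.

```python
import cmath
import math

def cycle_data(z):
    """Hermitian H with H[j][j+1] = z[j] * conj(z[j+1]) (indices mod n) plus its adjoint."""
    n = len(z)
    h = [[0j] * n for _ in range(n)]
    for j in range(n):
        m = (j + 1) % n
        h[j][m] += z[j] * z[m].conjugate()
        h[m][j] += z[m] * z[j].conjugate()
    return h

def leading_phases(h, iterations=500):
    """Power iteration on H + 2I, followed by entrywise normalization to unit modulus."""
    n = len(h)
    v = [complex(1.0, 0.1 * k) for k in range(n)]   # arbitrary starting vector
    for _ in range(iterations):
        w = [2 * v[i] + sum(h[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        v = [x / norm for x in w]
    return [x / abs(x) for x in v]

n = 7
truth = [cmath.exp(2j * math.pi * (k * k % n) / n) for k in range(n)]  # phases to recover
estimate = leading_phases(cycle_data(truth))
global_phase = estimate[0] / truth[0]   # the estimate is only defined up to this factor
```

The shift by $2I$ makes all eigenvalues of an odd cycle positive, so the iteration converges to the eigenvector for the largest eigenvalue of $H$; with noisy data a library eigensolver, as in the listing above, is the more robust choice.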

s = zeros(c);

count1 = 0;
count2 = 0;

for k = 2 : n                           % Generates a matrix s of exponents;
    if r(k) == 1                        % exponent s(k,j) represents the
        count1 = count1 + 1;            % power such that estimate(:,:,k)
        for j = 2 : n                   % and estimate(:,:,j)^s(k,j)
            if r(j) == 1                % have common support; note that
                count2 = count2 + 1;    % s(k,j)s(j,k)=1(mod n)
                for m = 1 : n
                    if mod((k-1)*(m-1),n) == 1
                        inverse = m - 1;
                    end
                end
                s(count2,count1) = mod((j-1)*inverse,n);
            end
        end
        count2 = 0;
    end
end
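The exponent matrix computed above is a table of ratios of cycle steps modulo the prime n: the entry for a pair of cycles is the step of one times the modular inverse of the step of the other. A compact Python sketch of the same table follows, using the built-in three-argument `pow` for the modular inverse (available in Python 3.8 and later); the step values are made up for illustration.

```python
def exponent_matrix(steps, n):
    """Return s with s[a][b] = steps[a] * steps[b]^{-1} (mod n), for a prime n."""
    return [[(sa * pow(sb, -1, n)) % n for sb in steps] for sa in steps]

n = 101                      # prime number of vertices
steps = [3, 7, 20, 45]       # hypothetical cycle steps (nonzero residues mod n)
s = exponent_matrix(steps, n)
```

Raising the adjacency matrix of the cycle with step `steps[b]` to the power `s[a][b]` gives the cycle with step `steps[a]`, which is what lets the global phases of different cycles be compared.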

% Below (commented out) is an alternative way of computing the matrix s

% s1 = zeros(c);
%
% for k = 1 : c
%     for j = 1 : c
%         for power = 1 : n
%             if Ac(:,:,j)^power == Ac(:,:,k)
%                 s1(k,j) = power;
%             end
%         end
%     end
% end

alpha = zeros(c);

for k = 1 : c
    for j = 1 : c
        sumsum = sum(sum(estimate(:,:,j)^(s(k,j)).*conj(estimate(:,:,k))));
        alpha(k,j) = sumsum/abs(sumsum);
    end     % Computes phases such that
end         % alpha(k,j)estimate(:,:,k) = estimate(:,:,j)^s(k,j)
            % for all k,j = 1,...,c

% To check accuracy, note that
%     alpha(k,j)*estimate(:,:,k)-estimate(:,:,j)^(s(k,j))
% should be a matrix of zeros for each pair (k,j) (without noise)

%
% At this point, note that we have the edge measurements for each cycle in
% G up to distinct global phase factors; to reconcile these global
% phases, observe that in the noiseless case we have
%     alpha(k,j)estimate(:,:,k) = estimate(:,:,j)^s(k,j)
% and so, letting Estimate(:,:,k) denote the actual weighted adjacency
% matrices, there exist phase factors beta(k) such that
%     alpha(k,j)Estimate(:,:,k)/beta(k)
%         = (Estimate(:,:,j)/beta(j))^s(k,j)
% from which equality in the noiseless case implies that
%     alpha(k,j)/beta(k) = 1/beta(j)^s(k,j)
% we then immediately have the relation
%     beta(k) = alpha(k,1)beta(1)^s(k,1)
%

beta = zeros(c,1);

matrix = (estimate(:,:,1) == 0) + estimate(:,:,1);
number = prod(prod(matrix));
beta(1) = number^(-1/n);    % Fixes beta(1) to be the inverse of the
                            % product of the phases in estimate(:,:,1)

for k = 2 : c
    beta(k) = alpha(k,1)*beta(1)^(s(k,1));
end         % Populates the phase vector beta based on alphas
            % and relationship above using beta(1)

XcEst = zeros(n,n,c);

for k = 1 : c
    XcEst(:,:,k) = estimate(:,:,k)*(conj(beta(k)));
end     % Generates the weighted adjacency matrices of the cycles in G
        % such that edge measurements in all cycles are estimated up
        % to a common global phase factor

A1 = zeros(n);

for k = 1 : c                   % Generates the weighted adjacency matrix
    A1 = A1 + XcEst(:,:,k);     % A1 of G (up to a global phase)
end

lead = 1;
leader = 0;
for k = 1 : n                                   % Eigenvector method for
    psi = exp(2*pi*1i*k/n);                     % angular synchronization
    H = psi*A1;                                 % performed for each phase
    [V,~] = eig(H+H');                          % factor psi
    v = abs(V(:,n)) - sqrt(1/n)*ones(n,1);
    discrepancy = max(abs(v));                  % Determines the phase psi
    if discrepancy < lead                       % such that the leading
        lead = discrepancy;                     % eigenvector of H+H' is
        leader = k;                             % closest to unimodular
    end
end

psi = exp(2*pi*1i*leader/n);        % Angular synchronization using the
H = psi*A1;                         % optimal phase psi generated above
[V,~] = eig(H+H');

omega = V(:,n)./abs(V(:,n));        % Generates the estimated phase error
                                    % vector omega

% Last edited: 17 Feb 2014
% Edited by: Aaron A. Nelson

%% Modulation and Global Phase Detection

% Determines the best modulation and phase factor for reconstructing an
% image from multistatic SAR data with phase errors known up to a
% global phase and modulation.

%
% Before running this code, load a matrix X of a SAR image.
%

X = im2double(X(:,:,1));
[width, height] = size(X);

n = 101;        % Specify size of partition of the Fourier
                % domain; this should correspond to the
                % number of measurements taken (i.e., the
                % length of the phase error vector) and
                % must be prime >2 to correspond to usable
                % circulant graphs
noise = 0.1;    % Specify magnitude of noise (must be in [0,1]);
                % 0 is no noise, 1 is full complex N(0,1) noise

mask = zeros(width,height);

for k = 1 : width           % Partitions Fourier domain into n sections
    if k <= width/2         % by angle; each section corresponds to
        x = k-1/2;          % a slice of the Fourier transform data
    else                    % that is available from multistatic SAR
        x = k-width-1/2;
    end
    for j = 1 : height
        if j <= height/2
            y = j-1/2;
        else
            y = j-height-1/2;
        end
        angle = atan(y/x);
        mask(k,j) = floor((angle+pi/2)*n/pi) + 1;
    end
end

I = eye(n);

randphases = ones(n,1) + noise*(randn(n,1)+1i*randn(n,1));
randphases = randphases./abs(randphases);
    % Generates random phases close to 1 with noise proportional to
    % predefined parameter noise (complex N(0,1), centered at 1)

globalphase = randn(1)+1i*randn(1);             % Generates a random global
globalphase = globalphase/abs(globalphase);     % phase
truemodulation = ceil(rand*n);      % Generates a random modulation index
randphases = globalphase*randphases.*fft(I(:,truemodulation));
    % Simulates the (noisy) phase error vector, known up to a
    % global phase and modulation

X = circshift(X,[floor(width/2),floor(height/2)]);
Y = fft2(X);        % Centers the image around the "origin"
                    % and takes the 2D Fourier transform
Ynoisy = Y.*randphases(mask);
    % Applies noisy phases to each slice of the image in the Fourier
    % domain; this simulates the input data from multistatic SAR

%
% From this point forward the algorithm is recovering the modulation and
% global phase factor from simulated multistatic SAR data constructed
% above; this is where real data will enter the algorithm
%

tvnorms = zeros(n,1);

for modulation = 1 : n                      % Determines an estimate for
    phases = conj(fft(I(:,modulation)));    % the modulation by
    Ymodulated = Ynoisy.*phases(mask);      % maximizing total variation
    Xhat = ifft2(Ymodulated);               % in the image over all
    Xhat = abs(Xhat);                       % possible modulations
    tvnorms(modulation) ...
        = sum(sum(abs(Xhat-circshift(Xhat,[1,0])))) ...
        + sum(sum(abs(Xhat-circshift(Xhat,[0,1]))));
end     % Note this implicitly ignores the global phase factor

[value, modulationestimate] = max(tvnorms);
    % Determines the modulation index that maximizes the total variation

phases = conj(fft(I(:,modulationestimate)));    % Computes the estimated
Ymodulated = Ynoisy.*phases(mask);              % image with the estimate
Xhat = ifft2(Ymodulated);                       % for the modulation

globalphaseestimate = sum(sum(Xhat));
globalphaseestimate = globalphaseestimate/abs(globalphaseestimate);
    % Determines an estimate for the global phase factor; takes the
    % estimate to be the average phase of all data in the
    % estimated image

Xhat = real(Xhat*conj(globalphaseestimate));
    % The final estimate for the image, with the estimated modulation
    % and global phase

globalphaseerror = abs(globalphaseestimate-globalphase);
relativeerror = norm(Xhat-X)/norm(X);

% Last edited: 18 Feb 2014